@@ -154,11 +154,11 @@ call is issued would be:
154154
155155 X -> X1;
156156 X1 -> b1 [constraint=false];
157- b1 -> X2 [label=modified] ;
158- X2 -> b2 [constraint=false]
159- b2 -> X3 [label=modified] ;
160- X3 -> b3 [constraint=false]
161- b3 -> y
157+ b1 -> X2;
158+ X2 -> b2 [constraint=false];
159+ b2 -> X3;
160+ X3 -> b3 [constraint=false];
161+ b3 -> y;
162162 }
163163
164164Another schema with some more complexity would be one where there is one primitive that
@@ -191,9 +191,10 @@ of actions would be:
191191 b1 -> b2 [style=invis];
192192
193193 subgraph cluster_1 {
194- X1 [label=X];
195- f1 [label=features];
196- X2 [label=X];
194+ {rank=same X1 f1}
195+ X1 [label=X group=c];
196+ f1 [label=features group=c];
197+ X2 [label=X group=c];
197198 f1 -> X1 [style=invis];
198199 X1 -> X2 [style=dashed];
199200 label = "Context";
@@ -204,8 +205,9 @@ of actions would be:
204205 {rank=same X features}
205206 features -> f1;
206207 X -> X1;
207- {X1 f1} -> b1 [constraint=false];
208- b1 -> X2 [label=encoded];
208+ X1 -> b1 [constraint=false];
209+ f1 -> b1 [constraint=false];
210+ b1 -> X2;
209211 X2 -> b2 [constraint=false]
210212 b2 -> y
211213 }
@@ -242,9 +244,9 @@ do its job:
242244 b0 -> b1 -> b2 [style=invis];
243245
244246 subgraph cluster_1 {
245- X1 [label=X];
246- f1 [label=features];
247- X2 [label=X];
247+ X1 [label=X group=c ];
248+ f1 [label=features group=c ];
249+ X2 [label=X group=c ];
248250 X1 -> f1 -> X2 [style=invis];
249251 X1 -> X2 [style=dashed];
250252 label = "Context";
@@ -256,12 +258,92 @@ do its job:
256258 X1 -> b0 [constraint=false];
257259 b0 -> f1;
258260 {X1 f1} -> b1 [constraint=false];
259- b1 -> X2 [label=encoded] ;
261+ b1 -> X2;
260262 X2 -> b2 [constraint=false]
261263 b2 -> y
262264 }
263265
264266
267+ JSON Annotations
268+ ----------------
269+
270+ Like primitives, Pipelines can also be annotated and stored as dicts or JSON files that contain
271+ the different arguments expected by the ``MLPipeline`` class, as well as the hyperparameter values
272+ that have been set and the tunable hyperparameters.
273+
274+ Representing a Pipeline as a dict
275+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
276+
277+ The dict representation of a Pipeline can be obtained directly from an ``MLPipeline`` instance
278+ by calling its ``to_dict`` method.
279+
280+ .. ipython:: python
281+
282+     pipeline.to_dict()
283+
284+ Notice how the dict includes all the arguments that were used when we created the ``MLPipeline``,
285+ as well as the hyperparameters that the pipeline is currently using and the complete specification
286+ of the tunable hyperparameters.
287+
288+ If we want to store the dict directly as a JSON file, we can do so by calling the ``save`` method
289+ with the path of the JSON file to create.
290+
291+ .. ipython:: python
292+
293+     pipeline.save('pipeline.json')
294+
295+ Loading a Pipeline from a dict
296+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
297+
298+ Similarly, once we have a dict specification, we can load the Pipeline directly from it
299+ by calling the ``MLPipeline.from_dict`` method.
300+
301+ Bear in mind that the hyperparameter values and tunable ranges will be taken from the dict.
302+ This means that if we want to tweak the tunable hyperparameters to adjust them to a specific
303+ problem or dataset, we can do that directly on our dict representation.
304+
305+ .. ipython:: python
306+
307+     pipeline_dict = {
308+         "primitives": [
309+             "sklearn.preprocessing.StandardScaler",
310+             "sklearn.ensemble.RandomForestClassifier"
311+         ],
312+         "hyperparameters": {
313+             "sklearn.ensemble.RandomForestClassifier#1": {
314+                 "n_jobs": -1,
315+                 "n_estimators": 100,
316+                 "max_depth": 5,
317+             }
318+         },
319+         "tunable_hyperparameters": {
320+             "sklearn.ensemble.RandomForestClassifier#1": {
321+                 "max_depth": {
322+                     "type": "int",
323+                     "default": 10,
324+                     "range": [
325+                         1,
326+                         30
327+                     ]
328+                 }
329+             }
330+         }
331+     }
332+     pipeline = MLPipeline.from_dict(pipeline_dict)
333+     pipeline.get_hyperparameters()
334+     pipeline.get_tunable_hyperparameters()
335+
336+ .. note:: Notice how we skipped many items in this last dict representation and only included
337+     the parts that we want to be different from the default values. MLBlocks will figure out
338+     the rest of the elements directly from the primitive annotations on its own!
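
As a rough illustration of the tweak described above, the tunable range can be narrowed on the plain dict before it is handed to ``MLPipeline.from_dict``. This is a minimal sketch using only the standard library: the helper name ``narrow_range`` is hypothetical and not part of MLBlocks, the decision to clamp the default into the new range is an assumption made for the example, and the ``from_dict`` call is left as a comment so the snippet runs without MLBlocks installed.

```python
import copy


def narrow_range(pipeline_dict, block, hyperparameter, low, high):
    """Return a copy of ``pipeline_dict`` with a narrower tunable range.

    Hypothetical helper for illustration only; not part of MLBlocks.
    """
    new_dict = copy.deepcopy(pipeline_dict)
    spec = new_dict["tunable_hyperparameters"][block][hyperparameter]
    spec["range"] = [low, high]
    # Assumption: keep the default inside the new range by clamping it.
    spec["default"] = min(max(spec["default"], low), high)
    return new_dict


block = "sklearn.ensemble.RandomForestClassifier#1"
pipeline_dict = {
    "primitives": [
        "sklearn.preprocessing.StandardScaler",
        "sklearn.ensemble.RandomForestClassifier",
    ],
    "tunable_hyperparameters": {
        block: {
            "max_depth": {"type": "int", "default": 10, "range": [1, 30]},
        }
    },
}

tweaked = narrow_range(pipeline_dict, block, "max_depth", 2, 8)
print(tweaked["tunable_hyperparameters"][block]["max_depth"])

# With MLBlocks installed, the tweaked dict would then be loaded as in the text:
# pipeline = MLPipeline.from_dict(tweaked)
```

Working on a deep copy leaves the original dict untouched, so the same base specification can be reused for several problems or datasets.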
339+
340+ As with the ``save`` method, the **MLPipeline** class offers a convenience ``load`` method
341+ that allows loading the pipeline directly from a JSON file:
342+
343+ .. ipython:: python
344+
345+     pipeline = MLPipeline.load('pipeline.json')
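
Since the text describes the saved file as the dict stored as JSON, the round trip can be pictured with the standard ``json`` module alone. The sketch below is an assumption about what such a file could contain, not the actual implementation of ``save`` or ``load``; it writes to a temporary directory and does not require MLBlocks.

```python
import json
import os
import tempfile

# A partial pipeline specification, in the same shape as the dict shown above.
pipeline_dict = {
    "primitives": [
        "sklearn.preprocessing.StandardScaler",
        "sklearn.ensemble.RandomForestClassifier",
    ],
    "hyperparameters": {
        "sklearn.ensemble.RandomForestClassifier#1": {
            "n_jobs": -1,
            "n_estimators": 100,
            "max_depth": 5,
        }
    },
}

# Write the dict to a JSON file inside a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "pipeline.json")
with open(path, "w") as json_file:
    json.dump(pipeline_dict, json_file, indent=4)

# Read it back and check that nothing was lost in the round trip.
with open(path) as json_file:
    loaded = json.load(json_file)

print(loaded == pipeline_dict)
```

Because the specification only uses strings, numbers, lists and dicts, it survives the JSON round trip unchanged, which is what makes a file like this convenient to edit by hand or keep under version control.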
346+
265347 .. _API Reference: ../api_reference.html
266348.. _primitives: ../primitives.html
267349.. _mlblocks.MLPipeline: ../api_reference.html#mlblocks.MLPipeline