Pipeline

Sep 16, 2017

Python scikit-learn provides a Pipeline utility to help automate machine learning workflows.

It is very useful for cleanly streamlining:

Data Transformation
Model Selection (can we pass an algorithm as a parameter?)
Hyperparameter Tuning
Save Pipeline
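
On the model selection question: a pipeline step can itself be listed as a grid parameter, so GridSearchCV can swap whole estimators in and out. The sketch below is only an illustration under assumptions: the data is synthetic, and the step names "standardize" and "model" and the two candidate estimators are made up for the example.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: not the dataset used in the post).
X, y = make_classification(n_samples=300, n_features=8, random_state=7)

pipeline = Pipeline([
    ("standardize", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),  # placeholder, replaced by the grid
])

# The key "model" is the step name; each value is a candidate estimator.
param_grid = {
    "model": [LogisticRegression(max_iter=1000), LinearDiscriminantAnalysis()],
}

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)

best_model_name = type(search.best_estimator_.named_steps["model"]).__name__
print(best_model_name, search.best_score_)
```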

Example #1

Standardize Data
Learn a Linear Discriminant Analysis model

#keynotes: cross_val_score on a pipeline is very convenient, since every step is refit inside each fold; to use your own metric, convert it into a scorer with make_scorer().
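
A minimal sketch of Example #1, assuming synthetic data from make_classification in place of the original dataset. The step names, the 10-fold split, and the F2 metric are illustrative choices, not taken from the post.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import fbeta_score, make_scorer
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: not the dataset used in the post).
X, y = make_classification(n_samples=300, n_features=8, random_state=7)

# Step 1: standardize the data. Step 2: learn an LDA model.
pipeline = Pipeline([
    ("standardize", StandardScaler()),
    ("lda", LinearDiscriminantAnalysis()),
])

# Wrap a plain metric function so cross_val_score can use it as a scorer.
f2_scorer = make_scorer(fbeta_score, beta=2)

cv = KFold(n_splits=10, shuffle=True, random_state=7)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring=f2_scorer)
print(scores.mean())
```

Because the scaler sits inside the pipeline, it is fit on the training part of each fold only, which avoids leaking statistics from the held-out part.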

Example #2

Feature Extraction with Principal Component Analysis (3 features)
Feature Extraction with Statistical Selection (6 features)
Feature Union
Learn a Logistic Regression Model

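#keynotes: you can also use a pipeline purely to transform data with pipeline.fit and pipeline.transform, as long as its last step is a transformer (for example, the feature union on its own, without the Logistic Regression step).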
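
A minimal sketch of Example #2, again on synthetic data (an assumption). The step names are illustrative, and SelectKBest is used with its default scoring function as a stand-in for "statistical selection".

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

# Synthetic stand-in data (assumption: not the dataset used in the post).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, random_state=7
)

# Two feature extractors whose outputs are concatenated column-wise.
features = FeatureUnion([
    ("pca", PCA(n_components=3)),        # 3 principal components
    ("select_best", SelectKBest(k=6)),   # 6 statistically selected features
])

model = Pipeline([
    ("features", features),
    ("logistic", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)
accuracy = model.score(X, y)

# The union by itself is a transformer, so it can be used just to transform data.
X_new = features.fit(X, y).transform(X)
print(X_new.shape)  # (300, 9): 3 PCA columns + 6 selected columns
```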

Example #3

Select Features
Imputation by 0
Imputation by mean
Feature Union
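
One way Example #3 could look, as a sketch under assumptions: the tiny array, the column split, and the helper take_columns are hypothetical, and SimpleImputer is the imputer class in current scikit-learn releases (the post predates it).

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer

# Toy data with missing values (hypothetical, for illustration only).
X = np.array([
    [1.0, np.nan, 10.0],
    [np.nan, 2.0, np.nan],
    [3.0, 4.0, 30.0],
])


def take_columns(data, columns):
    """Hypothetical helper: keep only the given column indices."""
    return data[:, columns]


# Branch 1: select columns 0 and 1, fill missing values with 0.
zero_branch = Pipeline([
    ("select", FunctionTransformer(take_columns, kw_args={"columns": [0, 1]})),
    ("impute_zero", SimpleImputer(strategy="constant", fill_value=0)),
])

# Branch 2: select column 2, fill missing values with the column mean.
mean_branch = Pipeline([
    ("select", FunctionTransformer(take_columns, kw_args={"columns": [2]})),
    ("impute_mean", SimpleImputer(strategy="mean")),
])

# Feature union: concatenate both branches back into one matrix.
union = FeatureUnion([("zero", zero_branch), ("mean", mean_branch)])
X_clean = union.fit_transform(X)
print(X_clean)
```

In newer scikit-learn versions, ColumnTransformer covers the same "different treatment per column group" use case without a custom selector.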

Example #4

Repeat Example #2
Grid Search on “n_components” and “k”
Use “best_estimator_” for prediction
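
A minimal sketch of Example #4. Nested parameters are addressed as step name, double underscore, parameter name. The data, step names, and candidate values in the grid are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline

# Synthetic stand-in data (assumption: not the dataset used in the post).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, random_state=7
)

model = Pipeline([
    ("features", FeatureUnion([
        ("pca", PCA(n_components=3)),
        ("select_best", SelectKBest(k=6)),
    ])),
    ("logistic", LogisticRegression(max_iter=1000)),
])

# "<step>__<substep>__<parameter>" reaches into the nested FeatureUnion.
param_grid = {
    "features__pca__n_components": [1, 2, 3],
    "features__select_best__k": [4, 6, 8],
}

search = GridSearchCV(model, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)

# best_estimator_ is the whole pipeline, refit on all data with the best parameters.
predictions = search.best_estimator_.predict(X)
```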

Example #5

Save pipeline/GridSearchCV
Load pipeline/GridSearchCV
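
A minimal sketch of Example #5 using joblib, one common way to persist scikit-learn objects. The pipeline, the grid, and the file name search.joblib are hypothetical, and the file goes to a temporary directory so the example cleans up easily.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: not the dataset used in the post).
X, y = make_classification(n_samples=200, n_features=8, random_state=7)

pipeline = Pipeline([
    ("standardize", StandardScaler()),
    ("logistic", LogisticRegression(max_iter=1000)),
])
search = GridSearchCV(pipeline, {"logistic__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)

# Save the fitted GridSearchCV (a fitted Pipeline is saved the same way).
path = os.path.join(tempfile.mkdtemp(), "search.joblib")
joblib.dump(search, path)

# Load it back and check that it predicts exactly like the original.
restored = joblib.load(path)
same_predictions = bool((restored.predict(X) == search.predict(X)).all())
print(same_predictions)
```

A saved file should only be loaded with a compatible scikit-learn version and from a trusted source, since loading executes pickled code.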


Written by Eugine Kang
