Jing Wang

In scikit-learn, a Pipeline chains a series of transformers with a final estimator. The intermediate transform steps prepare the data, and the last step is an estimator that fits the model and produces predictions. When I first learned it, it blew my mind with its plug-and-play, simple yet modular design philosophy.

At the same time, it can cause confusion: calling model.fit(X_train, y_train) calls fit() only on the last estimator step, while it actually calls fit_transform() on all the preceding transformer steps. Depending on how you use your own class in a pipeline, it needs to implement a few specific methods. Here are the common cases.

https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html

https://github.com/scikit-learn/scikit-learn/blob/7389dba/sklearn/pipeline.py#L239
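To make the fit/fit_transform distinction concrete, here is a minimal sketch of a custom transformer used inside a Pipeline. The transformer name (ColumnStandardizer) and the toy data are my own illustration, not from the linked sources; the pattern of inheriting BaseEstimator and TransformerMixin and implementing fit() and transform() is the standard scikit-learn convention.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

class ColumnStandardizer(BaseEstimator, TransformerMixin):
    """Illustrative transformer: learns per-column mean/std in fit(),
    standardizes in transform(). TransformerMixin derives fit_transform()
    from these two methods automatically."""

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0) + 1e-12  # avoid division by zero
        return self  # fit must return self so the pipeline can chain calls

    def transform(self, X):
        return (np.asarray(X, dtype=float) - self.mean_) / self.std_

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
y = np.array([0, 0, 1, 1])

model = Pipeline([
    ("scale", ColumnStandardizer()),  # transformer step: gets fit_transform
    ("clf", LogisticRegression()),    # final estimator step: gets plain fit
])

# One call: fit_transform on "scale", then fit on "clf" with the transformed X.
model.fit(X, y)
preds = model.predict(X)
```

Note that fit() returns self: this is what lets the pipeline (and you) chain calls like ColumnStandardizer().fit(X).transform(X).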

--


SSD is a dimensionality reduction method formulated in my paper. Our aim is to solve an optimization problem by sequentially searching for the projection along which the difference matrix (A) between two representations (X1, X2) is minimized, while the denominator penalizes noise dimensions with diminishing covariance (B). Loosely speaking, the…
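Since the objective has a matrix A in the numerator and a covariance B in the denominator, a natural reading is a generalized Rayleigh quotient, min_w (wᵀAw)/(wᵀBw), solvable as a generalized eigenproblem. The sketch below assumes that form and assumes concrete constructions for A and B (covariance of the difference and of the average of the two representations); the exact definitions are in the paper and may differ.

```python
import numpy as np
from scipy.linalg import eigh

def ssd_directions(X1, X2, k=2):
    """Sketch: find k projections w minimizing w^T A w / w^T B w.

    Assumed for illustration: A is the covariance of the difference
    between the two representations, B the covariance of their average.
    The actual SSD formulation in the paper may define A and B differently.
    """
    A = np.cov(X1 - X2, rowvar=False)        # difference "noise" structure
    B = np.cov((X1 + X2) / 2.0, rowvar=False)  # shared signal covariance
    # Generalized eigenproblem A w = lambda B w; eigh returns eigenvalues
    # in ascending order, so the first k columns minimize the quotient.
    _, vecs = eigh(A, B)
    return vecs[:, :k]
```

Solving the projections one eigenvector at a time (rather than all at once) matches the "sequential search" phrasing above; the joint eigendecomposition shown here yields the same directions for this quotient form.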

--
