# [ Archived Post ] Ali Ghodsi, Lec 2: PCA (Ordinary, Dual, Kernel)

Please note that this post is for my own educational purposes.

The general solution of PCA.

The trace of a matrix appears in the PCA objective. (PCA → a linear mapping → we can also project out-of-sample data into the same linear space → for a non-linear manifold → this is not so simple.)

PCA assumes centered data → we NEED TO center out-of-sample data with the same training mean as well.
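To make the centering point concrete, here is a minimal NumPy sketch of ordinary PCA (toy data and variable names are my own, not from the lecture): center the training data, eigendecompose the covariance, and reuse the same training mean when projecting an out-of-sample point.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # n samples x d features (toy data)

# Center the training data: PCA assumes zero-mean features.
mu = X.mean(axis=0)
Xc = X - mu

# Eigendecompose the d x d covariance matrix.
C = Xc.T @ Xc / X.shape[0]
eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns ascending eigenvalues
U = eigvecs[:, ::-1][:, :2]            # top-2 principal directions

# Project the training data onto the principal subspace.
Y = Xc @ U

# Out-of-sample point: center with the SAME training mean before projecting.
x_new = rng.normal(size=3)
y_new = (x_new - mu) @ U
```
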

We might have the case → where the number of samples is much smaller than the feature size → e.g., image data or gene data. (What should we do → project into the sample space) → DUAL PCA.

For DUAL PCA → we already assume that the latent dimension is smaller than the number of samples.

Dual PCA → saves computational time (eigendecompose an n × n Gram matrix instead of a d × d covariance matrix).
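A sketch of that saving, assuming n ≪ d (toy data, my own names): eigendecompose the small n × n Gram matrix and recover each principal direction as u = Xᵀv / √λ.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 500                         # few samples, many features
X = rng.normal(size=(n, d))
Xc = X - X.mean(axis=0)

# Dual PCA: work with the small n x n Gram matrix G = Xc Xc^T
# instead of the large d x d covariance matrix Xc^T Xc.
G = Xc @ Xc.T
w, V = np.linalg.eigh(G)
w, V = w[::-1], V[:, ::-1]             # descending eigenvalues

k = 2
# Recover unit-norm principal directions: u_i = Xc^T v_i / sqrt(lambda_i),
# since ||Xc^T v_i||^2 = v_i^T G v_i = lambda_i for unit v_i.
U = Xc.T @ V[:, :k] / np.sqrt(w[:k])
```

The same top-k directions come out as from the d × d eigendecomposition, but the matrix being factored is 20 × 20 rather than 500 × 500.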

Now Kernel PCA → map to a higher dimension. (kernel → acts like a filter → so we are projecting the data into a higher-dimensional space.) (Curse of dimensionality → hard to compute → and harder to model → we need a lot of data to train.)

Blessing of dimensionality → the structure of the data becomes simpler. (Transform the data so it looks much easier to separate.)

Functional analysis → a mapping from one space to another → larger dimension → K(x, y) → same as a dot product in the higher-dimensional space. (The trick is to recast your algorithm → so it depends only on dot products → save memory and avoid the curse of dimensionality.)
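A tiny check of that trick for the degree-2 polynomial kernel K(x, y) = (x·y)², whose explicit feature map for R² is (x₁², x₂², √2·x₁x₂). The feature map and names here are illustrative, not from the lecture:

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for x in R^2 (illustrative)."""
    return np.array([x[0]**2, x[1]**2, np.sqrt(2.0) * x[0] * x[1]])

def k(x, y):
    """Polynomial kernel: the dot product in the mapped space, computed cheaply."""
    return (x @ y) ** 2

x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])

# The kernel equals the dot product of the mapped vectors, without
# ever constructing phi for a genuinely high-dimensional space.
print(k(x, y), phi(x) @ phi(y))        # both equal (x . y)^2 = 1.0
```
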

KPCA → cannot do reconstruction (there is no explicit mapping back to the input space → the pre-image problem).

KPCA → shows what the data looks like in a lower-dimensional space.

Original space → mapped to some feature space. (But we never compute the feature space directly → only the kernel.)
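A minimal kernel PCA sketch along those lines (the RBF kernel and all names are my choices for illustration): build the kernel matrix, double-center it so the features are centered in feature space without ever computing them, then embed via the top eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
n = X.shape[0]

# RBF kernel matrix: K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)).
sigma = 1.0
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / (2 * sigma**2))

# Center in feature space using only the kernel:
# Kc = K - 1K - K1 + 1K1, where 1 = ones(n, n) / n.
ones = np.ones((n, n)) / n
Kc = K - ones @ K - K @ ones + ones @ K @ ones

# Eigendecompose the centered kernel matrix; the top eigenvectors,
# scaled by sqrt(eigenvalue), give the low-dimensional embedding.
w, V = np.linalg.eigh(Kc)
w, V = w[::-1], V[:, ::-1]
k_dim = 2
Y = V[:, :k_dim] * np.sqrt(np.maximum(w[:k_dim], 0.0))
```

Note that only K is ever formed; the (possibly infinite-dimensional) feature vectors of the RBF map are never computed.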