# [ Archived Post ] Ali Ghodsi, Lec 1: Principal Component Analysis

Please note that this post is for my own educational purpose.

PCA → dimension reduction (e.g., reduce from 3D to 2D, and so on).

The points actually have fewer intrinsic dimensions → e.g., 3D points that all lie on a flat 2D sheet of paper → that sheet is a manifold → and on that manifold the data is really 2D. (That is why we can reduce the dimension without losing information: much of the data lies on a manifold.) (But PCA is linear → it can only find a linear combination of the original coordinates.)
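A quick sketch of that idea on my own synthetic data (not from the lecture): 3D points that all lie on a 2D plane have a rank-2 covariance matrix, so two coordinates are enough to describe them.

```python
import numpy as np

rng = np.random.default_rng(3)
# 3-D points confined to a 2-D plane through the origin.
plane = rng.normal(size=(2, 3))         # two spanning vectors of the plane
X = rng.normal(size=(400, 2)) @ plane   # every point lies in their span
X = X - X.mean(axis=0)                  # center the data

S = np.cov(X, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(S))[::-1]
# The third eigenvalue is (numerically) zero: no variance off the plane.
print(np.isclose(eigvals[2], 0.0))  # True
```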

PCA → keep most of the variance → find a linear combination of the original features → the basis vectors are orthogonal to one another → and the first vector captures the most variance → not very robust to outliers.

Different notations. (Find a set of basis vectors in d dimensions.) (The optimization is unbounded → we need a constraint → Lagrangian.) (Constrain the length of u to 1 → this gives a unit basis vector.)
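As a sketch of the optimization the notes refer to (the standard PCA derivation, with $S$ the sample covariance matrix and $u$ the basis vector):

```latex
\max_{u}\; u^{\top} S u
\quad \text{s.t.} \quad u^{\top} u = 1
\qquad\Rightarrow\qquad
L(u, \lambda) = u^{\top} S u - \lambda \left( u^{\top} u - 1 \right)
\qquad
\frac{\partial L}{\partial u} = 2 S u - 2 \lambda u = 0
\;\Rightarrow\;
S u = \lambda u
```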

That is our optimization function. (The stationary point of the Lagrangian is a saddle point.)

The solution lies at that saddle point → a maximum over u and a minimum over the multiplier λ.

The solution we get → Su = λu → the definition of EVD. (There are multiple solutions → pick the maximum eigenvalue → it maximizes the variance, since substituting back into the objective gives uᵀSu = λ.) (If S is d×d → it has at most d eigenvalues and eigenvectors → d candidate solutions → so take the eigenvector with the maximum eigenvalue for the best solution.)
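A minimal NumPy sketch of this (my own example; `S` is the sample covariance of centered data `X`): the top eigenvector of S is the direction of maximum variance, and the projected variance equals the top eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D data: most variance lies along one direction.
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])
X = X - X.mean(axis=0)                # center the data

S = (X.T @ X) / (len(X) - 1)          # sample covariance matrix (d x d)
eigvals, eigvecs = np.linalg.eigh(S)  # EVD of the symmetric matrix S

# Sort by eigenvalue, largest first: the first column is the top PC.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

u1 = eigvecs[:, 0]                    # direction of maximum variance
# Variance of the projection equals the top eigenvalue: u1' S u1 = lambda_1
print(np.isclose(u1 @ S @ u1, eigvals[0]))  # True
```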

SVD → the operation and what each component stands for. (Another way to compute PCA → via SVD.)
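A sketch of the SVD route (my own example): for centered X with X = UΣVᵀ, the columns of V are the principal directions, and σᵢ²/(n−1) are the eigenvalues of the covariance matrix, so both routes agree.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
X = X - X.mean(axis=0)                      # center the data

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)

# Principal directions = rows of Vt; explained variances from singular values.
explained_var = sigma**2 / (len(X) - 1)

# Cross-check against the covariance EVD:
S = np.cov(X, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(S))[::-1]
print(np.allclose(explained_var, eigvals))  # True
```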

Take the noisy face data → now reduce the dimension. (The noise is not among the dominant signals → after projection we can see the face more clearly.)

Reconstructed image → with most of the noise removed.
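A sketch of the denoising idea on synthetic data (not the lecture's face dataset): keep only the top-k principal components and reconstruct; noise spread across the remaining directions is discarded.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 300, 50, 3

# Synthetic "clean" data living in a k-dimensional subspace, plus noise.
basis = rng.normal(size=(k, d))
clean = rng.normal(size=(n, k)) @ basis
noisy = clean + 0.1 * rng.normal(size=(n, d))

mean = noisy.mean(axis=0)
Xc = noisy - mean
U, sigma, Vt = np.linalg.svd(Xc, full_matrices=False)

# Rank-k reconstruction: project onto the top-k principal directions.
recon = (Xc @ Vt[:k].T) @ Vt[:k] + mean

err_noisy = np.linalg.norm(noisy - clean)
err_recon = np.linalg.norm(recon - clean)
# The reconstruction should be closer to the clean signal than the noisy data.
print(err_recon < err_noisy)
```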