VC: High Dimensional PCA and Kernel PCA

Jeheonpark · Published in The Startup · Sep 5, 2020

This post assumes you already know what PCA is. If you don't, please check my previous post.

n < p case (High dimensional PCA)

If the feature space is larger than the number of data points (n < p), the rank of the scatter matrix is at most n − 1, because the data are centred by the mean. Centring removes one degree of freedom: the centred vectors sum to zero, so the last one can be computed from the previous n − 1 data points.
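Here is a minimal sketch (using NumPy; not from the original post) that checks this rank argument numerically for random data with n < p:

```python
import numpy as np

# Minimal check of the rank argument: with n points in p dimensions (n < p),
# the centred data vectors sum to zero, so the scatter matrix has rank <= n - 1.
rng = np.random.default_rng(0)
n, p = 10, 50                          # n < p: high-dimensional setting
X = rng.normal(size=(n, p))            # rows are data points

Z = X - X.mean(axis=0)                 # centre by the mean
S = Z.T @ Z                            # p x p scatter matrix

print(np.allclose(Z.sum(axis=0), 0))   # True: centred rows sum to zero
print(np.linalg.matrix_rank(S))        # 9, i.e. n - 1
```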

We will use one key fact to compute high-dimensional PCA efficiently: every eigenvector u of S with a nonzero eigenvalue lies in the span of the z_i, the centred vectors of the original data.

Proof that the eigenvectors u lie in the span of the z_i.
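A brief reconstruction of the derivation (the post presents it as an image; the notation below is mine), assuming the scatter matrix is S = Σ_i z_i z_iᵀ:

```latex
S u = \lambda u
\quad\Longrightarrow\quad
\Big(\sum_{i=1}^{n} z_i z_i^{\top}\Big) u = \lambda u
\quad\Longrightarrow\quad
u = \sum_{i=1}^{n} \underbrace{\frac{z_i^{\top} u}{\lambda}}_{\alpha_i}\, z_i
```

So u is a linear combination of the centred vectors z_i, with coefficients α_i built from the eigenvalue λ and the inner products z_iᵀu.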

The proof starts from the eigenvalue equation of the scatter matrix, Su = λu. Writing Su as a sum of the vectors z_i weighted by the inner products z_iᵀu, we find that u is a linear combination of the z_i, where each coefficient is built from the eigenvalue, the eigenvector, and z transpose. With this fact we can reduce the computation from a p × p eigenvalue problem to an n × n eigenvalue problem, as the sketch below shows.
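The following is a minimal NumPy sketch of that reduction (my own illustration, not the author's code), assuming Z holds the centred data as columns, so S = Z Zᵀ is p × p and K = Zᵀ Z is n × n:

```python
import numpy as np

# Z holds the centred data as columns z_1, ..., z_n (p x n), so that
# S = Z Z^T is p x p and K = Z^T Z is n x n.
rng = np.random.default_rng(1)
n, p = 10, 50
X = rng.normal(size=(n, p))
Z = (X - X.mean(axis=0)).T

S = Z @ Z.T                            # p x p scatter matrix
K = Z.T @ Z                            # n x n matrix of inner products z_i^T z_j

# Solve the small n x n eigenvalue problem K v = lambda v ...
lam, V = np.linalg.eigh(K)

# ... and recover the eigenvectors of S as u = Z v / ||Z v||
# (only for nonzero eigenvalues; the rest carry no information).
keep = lam > 1e-10
U = Z @ V[:, keep]
U /= np.linalg.norm(U, axis=0)

# Check: S u = lambda u for every recovered direction.
for u, l in zip(U.T, lam[keep]):
    assert np.allclose(S @ u, l * u)
print("recovered", U.shape[1], "principal directions from an n x n problem")
```

The trick is that if Kv = λv, then S(Zv) = Z(ZᵀZ)v = λ(Zv), so each eigenvector of the small n × n problem maps directly to an eigenvector of the large p × p scatter matrix.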

We start with the eigenvalue decomposition of the scatter matrix S, and it becomes the eigenvalue decomposition of the matrix K, because the vector product is taken in the other order; it comes from the different…
