Data Science 365

Bring data into actionable insights.


An In-depth Guide to PCA with NumPy

7 min read · Feb 27, 2023



Principal Component Analysis (hereafter, PCA) is one of the most popular dimensionality reduction techniques used in machine learning.

It is considered a linear dimensionality reduction technique because each new feature it produces is a linear combination of the original input features, and together these new features represent the data in a lower-dimensional space.

More precisely,

PCA is a linear dimensionality reduction technique that transforms p input variables into a smaller number k (k << p) of uncorrelated variables, called principal components, by taking advantage of the existing correlations between the input variables in the dataset [ref¹].

ref¹: 3 Easy Steps to Perform Dimensionality Reduction Using Principal Component Analysis (PCA)
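To make the definition above concrete, here is a minimal NumPy sketch of one common way to compute principal components: eigendecomposition of the covariance matrix. It is only an illustration under stated assumptions, not necessarily the exact procedure this article follows. The helper name pca_numpy and the synthetic dataset are hypothetical and were made up for this example.

```python
import numpy as np

def pca_numpy(X, k):
    """Project X (n_samples x p) onto its first k principal components.

    Hypothetical helper for illustration only.
    """
    # Center each feature so the covariance is computed about the mean.
    X_centered = X - X.mean(axis=0)
    # p x p covariance matrix of the input variables (columns are variables).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is intended for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the direction with the largest variance comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Each column holds the weights of one linear combination of the p inputs.
    components = eigvecs[:, :k]
    # The k new variables (principal component scores).
    scores = X_centered @ components
    # Share of the total variance captured by each kept component.
    explained_ratio = eigvals[:k] / eigvals.sum()
    return scores, components, explained_ratio

# Synthetic, deliberately correlated data (an assumption, not from the article):
# 5 observed variables driven mostly by 2 hidden factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

scores, components, explained_ratio = pca_numpy(X, k=2)

# Off-diagonal entries near zero indicate the new variables are uncorrelated.
score_cov = np.cov(scores, rowvar=False)
print(scores.shape)
print(np.round(score_cov, 6))
print(explained_ratio)
```

Here p = 5 and k = 2. The covariance matrix of the scores should come out (numerically) diagonal, which is what "uncorrelated variables" means in the definition. This sketch only centers the data; whether to also scale each feature to unit variance depends on the dataset.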

We’ve performed PCA many times in my previous articles, always with Scikit-learn’s popular PCA() class.

Today, we will apply PCA to a dataset without the PCA() class, using a few specific NumPy functions instead. The entire process may seem tedious, but it is a great way to get an in-depth…


Written by Rukshan Pramoditha

3,000,000+ Views | BSc in Stats (University of Colombo, Sri Lanka) | Top 50 Data Science, AI/ML Technical Writer on Medium
