Published in MLearning.ai

Principal Component Analysis (PCA) Simplified


Problem Statement

Pre-requisite

Benefits of Dimension Reduction

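  • Lower consumption of computational resources.
  • Faster model training and prediction.
  • Potentially better model performance, since fewer and less redundant features can reduce overfitting.
  • Easier data visualization, since the data can be plotted in two or three dimensions.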

Applying PCA

  1. Manually calculating the principal components: PCA rests on a few linear algebra operations, so we will first generate the principal components by hand in order to fully understand the concept.
  2. Using the scikit-learn library: scikit-learn computes the principal components for us. This is what you would normally use when building a machine learning model, but it helps to understand the concept through method 1 first.

Steps to perform PCA

  1. Standardization
  2. Covariance Matrix
  3. Eigen Decomposition
  4. Sort by Eigenvalues
  5. Choose your Principal Components

Standardization
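
As a minimal sketch of this step, assuming standardization here means the usual z-score (subtract each feature's mean, then divide by its standard deviation), the snippet below rescales a small made-up array. The values in `X` are illustrative only and do not come from the article.

```python
import numpy as np

# Toy data: 4 samples (rows) x 2 features (columns); values are illustrative only.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0]])

# Column-wise mean and standard deviation (population std, ddof=0).
mean = X.mean(axis=0)
std = X.std(axis=0)

# z-score: centre each feature at 0 and scale it to unit variance.
X_std = (X - mean) / std

print(X_std.mean(axis=0))  # approximately 0 for every feature
print(X_std.std(axis=0))   # approximately 1 for every feature
```

After scaling, both columns are identical even though the second feature was ten times larger, which is why standardization keeps large-valued features from dominating the principal components.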

Covariance Matrix

Figure: variance vs covariance (source: author)
Figure (source: emathzone)
Figure: covariance matrix (source: author)
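
To make the covariance matrix concrete, here is a small sketch that builds it by hand and compares it with NumPy's `np.cov`. It assumes the sample covariance (dividing by n - 1) and uses made-up numbers; the variable names are illustrative.

```python
import numpy as np

# Toy data: 4 samples (rows) x 2 features (columns); values are illustrative only.
X = np.array([[2.0, 1.0],
              [4.0, 3.0],
              [6.0, 2.0],
              [8.0, 6.0]])

n_samples = X.shape[0]

# Centre each feature so that its mean is zero.
X_centered = X - X.mean(axis=0)

# Sample covariance matrix: (X_c^T X_c) / (n - 1).
# Diagonal entries are variances, off-diagonal entries are covariances.
cov_manual = X_centered.T @ X_centered / (n_samples - 1)

# NumPy's built-in; rowvar=False says that columns are the variables.
cov_numpy = np.cov(X, rowvar=False)

print(cov_manual)
print(np.allclose(cov_manual, cov_numpy))  # True
```

The matrix is symmetric, because the covariance of feature i with feature j equals the covariance of feature j with feature i.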

Eigen Decomposition

Example
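
One possible small example, not necessarily the one the author used: the sketch below decomposes a 2x2 symmetric matrix with `np.linalg.eigh` and checks the defining property A v = λ v for each eigenpair. The matrix `A` is a made-up stand-in for a covariance matrix.

```python
import numpy as np

# A small symmetric matrix standing in for a covariance matrix (illustrative values).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is meant for symmetric matrices: eigenvalues come back in ascending
# order, and the eigenvectors are the columns of the second return value.
eigenvalues, eigenvectors = np.linalg.eigh(A)

# Check the defining property A v = lambda v for every eigenpair.
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    assert np.allclose(A @ v, eigenvalues[i] * v)

print(eigenvalues)  # approximately [1. 3.]
```

For a covariance matrix, each eigenvector gives a direction in feature space and its eigenvalue gives the variance of the data along that direction.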


Sorting by Eigenvalues and Choosing Principal Components
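
A rough sketch of this step, under the assumption that components are chosen by cumulative explained variance: sort the eigenpairs from largest to smallest eigenvalue, then keep the first k. The eigenvalues, the placeholder eigenvectors and the 80% `threshold` below are all hypothetical.

```python
import numpy as np

# Hypothetical, unsorted eigenvalues with placeholder eigenvectors (as columns).
eigenvalues = np.array([0.5, 3.0, 1.5])
eigenvectors = np.eye(3)

# Indices that sort the eigenvalues from largest to smallest.
order = np.argsort(eigenvalues)[::-1]
eigenvalues_sorted = eigenvalues[order]
eigenvectors_sorted = eigenvectors[:, order]  # reorder columns to match

# Share of the total variance carried by each component, and the running total.
explained_ratio = eigenvalues_sorted / eigenvalues_sorted.sum()
cumulative = np.cumsum(explained_ratio)

# Smallest k whose cumulative explained variance reaches the threshold.
threshold = 0.80  # assumed cut-off; pick what suits your problem
k = int(np.searchsorted(cumulative, threshold)) + 1

components = eigenvectors_sorted[:, :k]
print(k, components.shape)  # 2 (3, 2)
```

Here the first two components carry about 90% of the variance, so the third can be dropped with little loss of information.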

Manually calculating and generating the principal components

  1. Load your data.
import numpy as np

# 10 samples (rows) x 3 features (columns)
data = np.array([
    [6., 3., 2.],
    [3., 2., 7.],
    [5., 4., 2.],
    [1., 4., 3.],
    [7., 3., 1.],
    [5., 1., 8.],
    [4., 2., 2.],
    [8., 6., 6.],
    [6., 3., 2.],
    [7., 1., 1.]])
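
The remaining steps from the list under "Steps to perform PCA" could be sketched on this data as follows. This is one way to do it with NumPy, not necessarily the author's exact code; keeping `k = 2` components is an assumption made for illustration.

```python
import numpy as np

# Same 10 x 3 data set as above.
data = np.array([
    [6., 3., 2.], [3., 2., 7.], [5., 4., 2.], [1., 4., 3.], [7., 3., 1.],
    [5., 1., 8.], [4., 2., 2.], [8., 6., 6.], [6., 3., 2.], [7., 1., 1.]])

# Step 1: standardize each feature (zero mean, unit variance).
X_std = (data - data.mean(axis=0)) / data.std(axis=0)

# Step 2: covariance matrix of the standardized features (3 x 3).
cov = np.cov(X_std, rowvar=False)

# Step 3: eigendecomposition; eigh suits symmetric matrices.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Step 4: sort eigenpairs from largest to smallest eigenvalue.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Step 5: keep the top k eigenvectors and project the data onto them.
k = 2  # assumed number of components to keep
W = eigenvectors[:, :k]
projected = X_std @ W

explained_ratio = eigenvalues / eigenvalues.sum()
print(projected.shape)   # (10, 2)
print(explained_ratio)   # share of variance per component, largest first
```

Each row of `projected` is one of the original samples expressed in the two new coordinates.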

PCA using Scikit-Learn

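  1. Load your data: we will use the built-in wine dataset. It ships with scikit-learn (pandas itself has no built-in datasets) and can be loaded into a pandas DataFrame.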
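
A minimal sketch of the scikit-learn route, assuming the wine dataset meant here is the one bundled with scikit-learn (`load_wine`) and that two components are kept; both choices are assumptions rather than something stated in the article.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load the wine data into a DataFrame (178 samples, 13 numeric features).
wine = load_wine()
df = pd.DataFrame(wine.data, columns=wine.feature_names)

# Standardize first so that no feature dominates because of its scale.
X_std = StandardScaler().fit_transform(df)

# Fit PCA and project the data onto the first two principal components.
pca = PCA(n_components=2)
components = pca.fit_transform(X_std)

print(components.shape)               # (178, 2)
print(pca.explained_variance_ratio_)  # share of variance per component
```

`explained_variance_ratio_` plays the same role as the sorted eigenvalue shares in the manual approach.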

When to use PCA

  1. When you want to reduce the number of your variables but are not able to clearly identify the variables you want to remove.
  2. When you want to make sure your variables are uncorrelated with each other, which the principal components are by construction.
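
To illustrate the second point, the sketch below generates two strongly correlated synthetic features and shows that their principal component scores are (numerically) uncorrelated. The data is random and purely illustrative; the seed and noise scale are arbitrary choices.

```python
import numpy as np

# Two synthetic features, the second strongly correlated with the first.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, 2.0 * x + rng.normal(scale=0.5, size=200)])

# Centre the data and rotate it onto the eigenvectors of its covariance matrix.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)
_, eigenvectors = np.linalg.eigh(cov)
scores = X_centered @ eigenvectors

# Correlation between the two columns before and after the rotation.
corr_before = np.corrcoef(X, rowvar=False)[0, 1]
corr_after = np.corrcoef(scores, rowvar=False)[0, 1]

print(corr_before)  # close to 1
print(corr_after)   # close to 0
```

Note that this guarantees zero linear correlation only; it does not make the new variables statistically independent in general.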

Joan Ngugi

Big Data & Analytics, Data Science, Machine Learning, Data Engineering | ngugijoan.com