Linear Discriminant Analysis (LDA)
Linear discriminant analysis (LDA) is a dimensionality reduction technique whose goal is to project a dataset onto a lower-dimensional space. LDA is also known as Normal Discriminant Analysis (NDA) or Discriminant Function Analysis, and it is a generalization of Fisher's linear discriminant.
Both Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) are linear transformation techniques that are commonly used for dimensionality reduction.
PCA can be described as an “unsupervised” algorithm, since it “ignores” class labels and its goal is to find the directions (the so-called principal components) that maximize the variance in a dataset.
In contrast to PCA, LDA is "supervised" and computes the directions ("linear discriminants") that represent the axes that maximize the separation between multiple classes.
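The contrast can be sketched in NumPy with a hypothetical two-class toy dataset (all names and values below are illustrative assumptions, not from the article): PCA takes the top eigenvector of the overall covariance, ignoring labels, while two-class LDA uses the labels through the class means and within-class scatter.

```python
import numpy as np

# Hypothetical toy data: two 2-D Gaussian classes (illustration only)
rng = np.random.default_rng(0)
X1 = rng.normal([0.0, 0.0], [1.0, 0.2], size=(100, 2))
X2 = rng.normal([2.0, 0.5], [1.0, 0.2], size=(100, 2))
X = np.vstack([X1, X2])

# PCA direction: top eigenvector of the overall covariance (labels ignored)
_, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
pca_dir = vecs[:, -1]  # eigenvector with the largest eigenvalue

# LDA direction (two classes): S_W^{-1} (mu_1 - mu_2), labels used
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
Sw = (np.cov(X1, rowvar=False) * (len(X1) - 1)
      + np.cov(X2, rowvar=False) * (len(X2) - 1))
lda_dir = np.linalg.solve(Sw, mu1 - mu2)
lda_dir /= np.linalg.norm(lda_dir)

print(pca_dir, lda_dir)
```

Because this data varies most along the direction that also separates the classes, the two directions come out similar here; on data whose largest-variance direction does not separate the classes, they diverge.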
How Does LDA Work?
LDA utilizes Fisher's linear discriminant method to separate the classes.
Fisher's linear discriminant is a classification method that projects high-dimensional data onto a one-dimensional space and performs classification in that one-dimensional space.
The projection maximizes the distance between the means of the classes while minimizing the variance within each class.
- Classes: 1, 2 and 3
- Mean of Classes: µ1, µ2 and µ3
- Scatter Between Classes: SB1, SB2 and SB3
- Scatter Within Classes: SW1, SW2 and SW3
- Dataset Mean: µ
The idea is to maximize the SBs (scatter between classes) while minimizing the SWs (scatter within classes).
Formula
Motivation
- Find a direction that will amplify the inter-class difference.
- Maximize (squared) difference between the projected means
- Minimize the projected scatter within each class
Scatter
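In the usual two-class notation, with each sample projected as $y = w^{\top} x$, the scatter of class $i$ after projection can be written as:

```latex
\tilde{s}_i^{\,2} \;=\; \sum_{y \in \mathcal{C}_i} \left( y - \tilde{\mu}_i \right)^2,
\qquad \tilde{\mu}_i = w^{\top} \mu_i
```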
Mean Difference
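The squared difference between the projected class means, in the same notation, is:

```latex
\left( \tilde{\mu}_1 - \tilde{\mu}_2 \right)^2 \;=\; \left( w^{\top} \mu_1 - w^{\top} \mu_2 \right)^2
```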
Scatter Difference
Fisher Index
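The Fisher criterion combines the two quantities above: maximize the projected mean difference relative to the total within-class scatter, which in matrix form becomes a ratio of quadratic forms in $S_B$ and $S_W$:

```latex
J(w) \;=\; \frac{\left( \tilde{\mu}_1 - \tilde{\mu}_2 \right)^2}{\tilde{s}_1^{\,2} + \tilde{s}_2^{\,2}}
\;=\; \frac{w^{\top} S_B \, w}{w^{\top} S_W \, w}
```

Maximizing $J(w)$ leads to the generalized eigenvalue problem $S_W^{-1} S_B \, w = \lambda w$ used in the steps below.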
What this means is that when selecting eigenvalues, we always select at most C−1 eigenvalues (where C is the number of classes) and their corresponding eigenvectors, because the between-class scatter matrix SB has rank at most C−1.
Example
Dataset
Step 1: Compute Within Class Scatter matrix (SW)
Find covariance matrix for each class
Class 1
Mean Matrix
Covariance
Adding S1 through S5 gives us Sc1.
Class 2
Mean matrix
As with Sc1, adding S6 through S10 gives us the covariance Sc2.
Adding Sc1 and Sc2 gives us SW, the within-class scatter matrix.
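Step 1 can be sketched in NumPy. The article's actual dataset values are shown only in images, so the 2-feature, 5-samples-per-class dataset below is a hypothetical stand-in:

```python
import numpy as np

# Hypothetical stand-in dataset: 5 samples per class, 2 features each
X1 = np.array([[4.0, 2.0], [2.0, 4.0], [2.0, 3.0], [3.0, 6.0], [4.0, 4.0]])
X2 = np.array([[9.0, 10.0], [6.0, 8.0], [9.0, 5.0], [8.0, 7.0], [10.0, 8.0]])

def class_scatter(X):
    """Scatter of one class: sum of (x - mean) outer products over its samples."""
    d = X - X.mean(axis=0)
    return d.T @ d

Sc1 = class_scatter(X1)  # equivalent to summing the per-sample matrices S1..S5
Sc2 = class_scatter(X2)  # equivalent to summing S6..S10
Sw = Sc1 + Sc2           # within-class scatter matrix
print(Sw)
```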
Step 2: Compute Between Class Scatter matrix (SB)
We already have the mean of each feature for Class 1 and Class 2.
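For two classes, SB reduces to the outer product of the difference between the class mean vectors. A sketch with hypothetical means (matching the stand-in dataset above, not the article's image values):

```python
import numpy as np

# Hypothetical class means for a 2-feature, 2-class dataset (illustration only)
mu1 = np.array([3.0, 3.8])
mu2 = np.array([8.4, 7.6])

# Two-class between-class scatter: outer product of the mean difference
d = (mu1 - mu2).reshape(-1, 1)
SB = d @ d.T
print(SB)
```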
Step 3: Find the best LDA projection vector
As in PCA, we find the best projection vector using the eigenvector with the largest eigenvalue. The eigenvector can be represented in the form below.
We already have SW and SB
Solving for lambda, we get lambda = 15.65 as the highest value. Now, solve for the corresponding eigenvector for each value of lambda.
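Numerically, this step solves the eigenvalue problem for SW⁻¹SB and keeps the eigenvector of the largest eigenvalue. A sketch using the hypothetical SW and SB from the earlier snippets (so the eigenvalue here will not match the article's 15.65):

```python
import numpy as np

# Hypothetical SW and SB (illustration; not the article's image values)
Sw = np.array([[13.2, -1.2], [-1.2, 22.0]])
SB = np.array([[29.16, 20.52], [20.52, 14.44]])

# Solve Sw^{-1} SB w = lambda w; keep the eigenvector of the largest eigenvalue
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, SB))
eigvals = eigvals.real           # eigenvalues are real for this problem
i = np.argmax(eigvals)
w = eigvecs[:, i].real           # best 1-D projection vector
print(eigvals[i], w)
```

With C = 2 classes, SB has rank 1, so only one eigenvalue is nonzero, consistent with keeping C−1 = 1 discriminant direction.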
Step 4: Transforming the samples onto the new subspace.
So, using LDA, we have transformed the samples onto the new one-dimensional subspace.
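The final projection is just a dot product of each sample with the chosen eigenvector. A sketch with hypothetical samples and an assumed unit-norm projection vector:

```python
import numpy as np

# Hypothetical samples and LDA direction (illustration only)
X = np.array([[4.0, 2.0], [2.0, 4.0], [9.0, 10.0], [6.0, 8.0]])
w = np.array([0.91, 0.39])  # assumed unit-norm projection vector

# Step 4: project each sample onto the 1-D LDA subspace
X_lda = X @ w
print(X_lda)
```

Each row of X collapses to a single scalar, and classification can then be done with a simple threshold on that scalar.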
I hope this article provides you with a basic understanding of Linear Discriminant Analysis.
If you have any questions or if you find anything misrepresented please let me know.
Thanks!