Linear Discriminant Analysis (LDA)


Linear discriminant analysis (LDA) is a dimensionality reduction technique whose goal is to project a dataset onto a lower-dimensional space. Linear discriminant analysis, also known as Normal Discriminant Analysis (NDA) or Discriminant Function Analysis, is a generalization of Fisher’s linear discriminant.

Both Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) are linear transformation techniques that are commonly used for dimensionality reduction.

PCA can be described as an “unsupervised” algorithm, since it “ignores” class labels and its goal is to find the directions (the so-called principal components) that maximize the variance in a dataset.

In contrast to PCA, LDA is “supervised” and computes the directions (“linear discriminants”) that represent the axes that maximize the separation between multiple classes.
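For a quick feel for the difference, here is a minimal scikit-learn sketch (my own illustration on a made-up dataset from make_blobs, not part of the original article) showing that PCA never looks at the class labels, while LDA requires them:

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical 3-class, 4-feature dataset, purely for illustration
X, y = make_blobs(n_samples=300, centers=3, n_features=4, random_state=0)

# PCA is unsupervised: the labels y are never passed to it
X_pca = PCA(n_components=2).fit_transform(X)

# LDA is supervised: it needs y to find class-separating directions
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

print(X_pca.shape, X_lda.shape)  # (300, 2) (300, 2)
```

Both calls return a two-dimensional embedding, but only the LDA one was chosen with the class labels in mind.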

Before and After LDA

How Does LDA Work?

LDA utilizes Fisher’s linear discriminant method to separate the classes.

Fisher’s linear discriminant is a classification method that projects high-dimensional data onto a one-dimensional space and performs classification in that one-dimensional space.

The projection maximizes the distance between the means of the classes while minimizing the variance within each class.

  • Classes: 1, 2 and 3
  • Mean of Classes: µ1, µ2 and µ3
  • Scatter Between Classes: SB1, SB2 and SB3
  • Scatter Within Classes: SW1, SW2 and SW3
  • Dataset Mean: µ

The idea is to maximize SB, the Scatter Between Classes, while minimizing SW, the Scatter Within Classes.

Formula
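In standard notation (assuming C classes, where class i has N_i samples and mean µ_i, and µ is the overall dataset mean), these quantities and the objective can be written as:

```latex
% Within-class scatter: sum of the per-class scatter matrices
S_W = \sum_{i=1}^{C} S_{W_i}, \qquad
S_{W_i} = \sum_{x \in \omega_i} (x - \mu_i)(x - \mu_i)^{T}

% Between-class scatter: spread of the class means around the dataset mean
S_B = \sum_{i=1}^{C} N_i (\mu_i - \mu)(\mu_i - \mu)^{T}

% Fisher criterion: LDA looks for the projection w that maximizes this ratio
J(w) = \frac{w^{T} S_B \, w}{w^{T} S_W \, w}
```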

Motivation

  • Find a direction that will amplify the inter-class difference.
  • Maximize (squared) difference between the projected means
  • Minimize the projected scatter within each class

Scatter

Mean Difference

Scatter Difference

Fisher Index
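In the two-class case these quantities become scalars once each sample x is projected onto a direction w; in the standard notation (µ̃_i and s̃_i² are the projected class mean and projected class scatter):

```latex
% Projection of a sample x and of the class means onto the direction w
y = w^{T} x, \qquad \tilde{\mu}_i = w^{T} \mu_i

% Scatter (within each projected class)
\tilde{s}_i^{\,2} = \sum_{y \in \omega_i} (y - \tilde{\mu}_i)^{2}

% Mean Difference (between the projected class means)
(\tilde{\mu}_1 - \tilde{\mu}_2)^{2}

% Fisher Index: the ratio to be maximized
J(w) = \frac{(\tilde{\mu}_1 - \tilde{\mu}_2)^{2}}{\tilde{s}_1^{\,2} + \tilde{s}_2^{\,2}}
```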

What this means is that when selecting eigenvalues we always keep at most C-1 eigenvalues and their corresponding eigenvectors, where C is the number of classes, because the between-class scatter matrix SB has rank of at most C-1.
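As a quick sanity check (an illustration with scikit-learn, not part of the original worked example), the Iris dataset has C = 3 classes and 4 features, so LDA yields at most C-1 = 2 discriminant directions:

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)  # 4 features, C = 3 classes
lda = LinearDiscriminantAnalysis().fit(X, y)

# No matter how many features, at most C - 1 = 2 components come out
print(lda.transform(X).shape)              # (150, 2)
print(len(lda.explained_variance_ratio_))  # 2
```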

Example

Dataset

Step 1: Compute Within-Class Scatter Matrix (SW)

Find the covariance matrix for each class.

Class 1

Mean Matrix

Covariance

Adding S1 through S5, we get Sc1.

Class 2

Mean Matrix

Similar to Sc1, adding S6 through S10 gives us the covariance Sc2.

Adding Sc1 and Sc2 gives us SW, the Within-Class Scatter Matrix.
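As a stand-in for the figures, here is a small NumPy sketch of Step 1 on a hypothetical two-class, two-feature dataset (the values are illustrative only, not the article’s):

```python
import numpy as np

# Hypothetical two-class, 2-D dataset (illustrative values only)
X1 = np.array([[1., 2.], [2., 3.], [3., 3.], [4., 5.], [5., 5.]])    # Class 1
X2 = np.array([[7., 8.], [8., 7.], [9., 9.], [8., 10.], [10., 8.]])  # Class 2

mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)  # class mean vectors

# Per-sample scatter matrices (S1..S5 for Class 1, S6..S10 for Class 2),
# i.e. outer products of the mean-centered samples, summed per class
Sc1 = sum(np.outer(x - mu1, x - mu1) for x in X1)
Sc2 = sum(np.outer(x - mu2, x - mu2) for x in X2)

Sw = Sc1 + Sc2  # Within-Class Scatter Matrix
print(Sw)
```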

Step 2: Compute Between-Class Scatter Matrix (SB)

We already have the mean of each feature for Class 1 and Class 2.
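For two classes, the between-class scatter matrix reduces to a single outer product of the difference between the class means:

```latex
S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^{T}
```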

Step 3: Find the best LDA projection vector

Similar to PCA, we find the best projection vector by taking the eigenvector with the largest eigenvalue. The eigenvalue problem can be represented in the form below.
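In standard notation, maximizing the Fisher criterion J(w) leads to the generalized eigenvalue problem:

```latex
S_W^{-1} S_B \, w = \lambda \, w
```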

We already have SW and SB

Solving for lambda, the highest value we get is lambda = 15.65. Now, we solve for the corresponding eigenvector for each value of lambda.

Step 4: Transform the samples onto the new subspace

So, using LDA, we have transformed the samples onto the new one-dimensional subspace, as shown below.

Transformation
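Putting the four steps together, here is a self-contained NumPy sketch of the whole procedure, reusing the same hypothetical data as the Step 1 sketch above (illustrative values, not the article’s dataset):

```python
import numpy as np

# Same hypothetical two-class, 2-D dataset as in the Step 1 sketch
X1 = np.array([[1., 2.], [2., 3.], [3., 3.], [4., 5.], [5., 5.]])
X2 = np.array([[7., 8.], [8., 7.], [9., 9.], [8., 10.], [10., 8.]])

# Step 1: within-class scatter matrix SW
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
Sw = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)

# Step 2: between-class scatter matrix SB (two-class form)
d = (mu1 - mu2).reshape(-1, 1)
Sb = d @ d.T

# Step 3: best projection vector = eigenvector of SW^-1 SB
#         associated with the largest eigenvalue
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
eigvals, eigvecs = eigvals.real, eigvecs.real  # eigenvalues are real here
w = eigvecs[:, np.argmax(eigvals)]

# Step 4: transform (project) every sample onto the 1-D LDA subspace
y1, y2 = X1 @ w, X2 @ w
print("projection vector:", w)
print("projected Class 1:", y1)
print("projected Class 2:", y2)
```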

I hope this article provides you with a basic understanding of Linear Discriminant Analysis.

If you have any questions or if you find anything misrepresented please let me know.

Thanks!
