Linear Discriminant Analysis (LDA)
Linear discriminant analysis (LDA) is a dimensionality reduction technique whose goal is to project a dataset onto a lower-dimensional space. LDA is also known as Normal Discriminant Analysis (NDA) or Discriminant Function Analysis, and it is a generalization of Fisher's linear discriminant.
Both Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) are linear transformation techniques that are commonly used for dimensionality reduction.
PCA can be described as an “unsupervised” algorithm, since it “ignores” class labels and its goal is to find the directions (the so-called principal components) that maximize the variance in a dataset.
In contrast to PCA, LDA is "supervised" and computes the directions ("linear discriminants") that represent the axes that maximize the separation between multiple classes.
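The contrast can be sketched in NumPy with a hypothetical two-class toy dataset (all names and values below are illustrative assumptions, not from the article): PCA takes the top eigenvector of the overall covariance, ignoring labels, while two-class LDA uses the labels through the class means and within-class scatter.

```python
import numpy as np

# Hypothetical toy data: two 2-D Gaussian classes (illustration only)
rng = np.random.default_rng(0)
X1 = rng.normal([0.0, 0.0], [1.0, 0.2], size=(100, 2))
X2 = rng.normal([2.0, 0.5], [1.0, 0.2], size=(100, 2))
X = np.vstack([X1, X2])

# PCA direction: top eigenvector of the overall covariance (labels ignored)
_, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
pca_dir = vecs[:, -1]  # eigenvector with the largest eigenvalue

# LDA direction (two classes): S_W^{-1} (mu_1 - mu_2), labels used
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
Sw = (np.cov(X1, rowvar=False) * (len(X1) - 1)
      + np.cov(X2, rowvar=False) * (len(X2) - 1))
lda_dir = np.linalg.solve(Sw, mu1 - mu2)
lda_dir /= np.linalg.norm(lda_dir)

print(pca_dir, lda_dir)
```

Because this data varies most along the direction that also separates the classes, the two directions come out similar here; on data whose largest-variance direction does not separate the classes, they diverge.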
How Does LDA Work?
LDA utilizes Fisher's linear discriminant method to separate the classes.
Fisher's linear discriminant is a classification method that projects high-dimensional data onto a one-dimensional space and performs classification in that one-dimensional space.
The projection maximizes the distance between the means of the classes while minimizing the variance within each class.
- Classes: 1, 2 and 3
- Mean of Classes: µ1, µ2 and µ3
- Scatter Between Classes: SB1, SB2 and SB3
- Scatter Within Classes: SW1, SW2 and SW3
- Dataset Mean: µ
The idea is to maximize the SBs (scatter between classes) while minimizing the SWs (scatter within classes).
Formula
Motivation
- Find a direction that will amplify the inter-class difference.
- Maximize (squared) difference between the projected means
- Minimize the projected scatter within each class
Scatter
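In the usual two-class notation, with each sample projected as $y = w^{\top} x$, the scatter of class $i$ after projection can be written as:

```latex
\tilde{s}_i^{\,2} \;=\; \sum_{y \in \mathcal{C}_i} \left( y - \tilde{\mu}_i \right)^2,
\qquad \tilde{\mu}_i = w^{\top} \mu_i
```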
Mean Difference
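The squared difference between the projected class means, in the same notation, is:

```latex
\left( \tilde{\mu}_1 - \tilde{\mu}_2 \right)^2 \;=\; \left( w^{\top} \mu_1 - w^{\top} \mu_2 \right)^2
```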
Scatter Difference
Fisher Index
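The Fisher criterion combines the two quantities above: maximize the projected mean difference relative to the total within-class scatter, which in matrix form becomes a ratio of quadratic forms in $S_B$ and $S_W$:

```latex
J(w) \;=\; \frac{\left( \tilde{\mu}_1 - \tilde{\mu}_2 \right)^2}{\tilde{s}_1^{\,2} + \tilde{s}_2^{\,2}}
\;=\; \frac{w^{\top} S_B \, w}{w^{\top} S_W \, w}
```

Maximizing $J(w)$ leads to the generalized eigenvalue problem $S_W^{-1} S_B \, w = \lambda w$ used in the steps below.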
What this means is that when selecting eigenvalues, we always select at most C−1 eigenvalues (where C is the number of classes) and their corresponding eigenvectors, because the between-class scatter matrix SB has rank at most C−1.
Example
Dataset
Step 1: Compute Within Class Scatter matrix (SW)
Find covariance matrix for each class
Class 1
Mean Matrix
Covariance
Adding S1 through S5 gives us Sc1.
Class 2
Mean matrix
As with Sc1, adding S6 through S10 gives us the covariance Sc2.
Adding Sc1 and Sc2 gives us SW, the within-class scatter matrix.
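Step 1 can be sketched in NumPy. The article's actual dataset values are shown only in images, so the 2-feature, 5-samples-per-class dataset below is a hypothetical stand-in:

```python
import numpy as np

# Hypothetical stand-in dataset: 5 samples per class, 2 features each
X1 = np.array([[4.0, 2.0], [2.0, 4.0], [2.0, 3.0], [3.0, 6.0], [4.0, 4.0]])
X2 = np.array([[9.0, 10.0], [6.0, 8.0], [9.0, 5.0], [8.0, 7.0], [10.0, 8.0]])

def class_scatter(X):
    """Scatter of one class: sum of (x - mean) outer products over its samples."""
    d = X - X.mean(axis=0)
    return d.T @ d

Sc1 = class_scatter(X1)  # equivalent to summing the per-sample matrices S1..S5
Sc2 = class_scatter(X2)  # equivalent to summing S6..S10
Sw = Sc1 + Sc2           # within-class scatter matrix
print(Sw)
```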
Step 2: Compute Between Class Scatter matrix (SB)
We already have the mean of each feature for Class 1 and Class 2.
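For two classes, SB reduces to the outer product of the difference between the class mean vectors. A sketch with hypothetical means (matching the stand-in dataset above, not the article's image values):

```python
import numpy as np

# Hypothetical class means for a 2-feature, 2-class dataset (illustration only)
mu1 = np.array([3.0, 3.8])
mu2 = np.array([8.4, 7.6])

# Two-class between-class scatter: outer product of the mean difference
d = (mu1 - mu2).reshape(-1, 1)
SB = d @ d.T
print(SB)
```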
Step 3: Find the best LDA projection vector
As in PCA, we find the best projection vector using the eigenvector with the largest eigenvalue. The eigenvector can be represented in the form below.
We already have SW and SB
Solving for lambda, we get lambda = 15.65 as the highest value. Now, solve for the corresponding eigenvector for each value of lambda.
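Numerically, this step solves the eigenvalue problem for SW⁻¹SB and keeps the eigenvector of the largest eigenvalue. A sketch using the hypothetical SW and SB from the earlier snippets (so the eigenvalue here will not match the article's 15.65):

```python
import numpy as np

# Hypothetical SW and SB (illustration; not the article's image values)
Sw = np.array([[13.2, -1.2], [-1.2, 22.0]])
SB = np.array([[29.16, 20.52], [20.52, 14.44]])

# Solve Sw^{-1} SB w = lambda w; keep the eigenvector of the largest eigenvalue
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, SB))
eigvals = eigvals.real           # eigenvalues are real for this problem
i = np.argmax(eigvals)
w = eigvecs[:, i].real           # best 1-D projection vector
print(eigvals[i], w)
```

With C = 2 classes, SB has rank 1, so only one eigenvalue is nonzero, consistent with keeping C−1 = 1 discriminant direction.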
Step 4: Transforming the samples onto the new subspace.
So, using LDA, we have transformed the samples onto the new one-dimensional subspace.
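The final projection is just a dot product of each sample with the chosen eigenvector. A sketch with hypothetical samples and an assumed unit-norm projection vector:

```python
import numpy as np

# Hypothetical samples and LDA direction (illustration only)
X = np.array([[4.0, 2.0], [2.0, 4.0], [9.0, 10.0], [6.0, 8.0]])
w = np.array([0.91, 0.39])  # assumed unit-norm projection vector

# Step 4: project each sample onto the 1-D LDA subspace
X_lda = X @ w
print(X_lda)
```

Each row of X collapses to a single scalar, and classification can then be done with a simple threshold on that scalar.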
I hope this article provides you with a basic understanding of Linear Discriminant Analysis.
If you have any questions or if you find anything misrepresented please let me know.
Thanks!