TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


How t-SNE Outperforms PCA in Dimensionality Reduction

15 min read · May 23, 2023


Photo by Pat Whelen on Unsplash

In machine learning, dimensionality reduction refers to reducing the number of input variables in a dataset. That number of input variables is called the dimensionality of the dataset.

Dimensionality reduction techniques fall into two main categories: linear and non-linear (manifold) methods.

Under linear methods, we have discussed Principal Component Analysis (PCA), Factor Analysis (FA), Linear Discriminant Analysis (LDA) and Non-Negative Matrix Factorization (NMF).
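As a quick refresher on the linear side, a method such as PCA takes only a few lines with scikit-learn. The snippet below is a minimal sketch, not code from this article; the digits dataset and the choice of two components are assumptions made purely for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load a 64-dimensional dataset (8x8 digit images flattened into 64 features)
X, y = load_digits(return_X_y=True)

# PCA is sensitive to feature scales, so standardize the features first
X_scaled = StandardScaler().fit_transform(X)

# Project the 64 features onto the first 2 principal components
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

print(X_pca.shape)                    # (1797, 2)
print(pca.explained_variance_ratio_)  # variance captured by each component
```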

Under non-linear methods, we have discussed Autoencoders (AEs) and Kernel PCA.
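A non-linear counterpart such as Kernel PCA follows almost the same API. The sketch below uses an RBF kernel on a toy dataset; the gamma value and the dataset are assumptions chosen only to illustrate the idea.

```python
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA

# A toy dataset with a non-linear structure that plain PCA cannot unfold
X, y = make_moons(n_samples=300, noise=0.05, random_state=42)

# Kernel PCA with an RBF kernel maps the data into a feature space
# where the non-linear structure becomes (approximately) linearly separable
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=15)
X_kpca = kpca.fit_transform(X)

print(X_kpca.shape)  # (300, 2)
```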

t-Distributed Stochastic Neighbor Embedding (t-SNE) is also a non-linear dimensionality reduction method used for visualizing high-dimensional data in a lower-dimensional space to find important clusters or groups in the data.
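To make this concrete, here is a minimal sketch of how t-SNE is typically applied for 2-D visualization with scikit-learn. The dataset, the perplexity, and the other parameter values are assumptions for illustration, not settings taken from this article.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# 64-dimensional digit images; the labels are used only to color the plot,
# t-SNE itself never sees them
X, y = load_digits(return_X_y=True)

# Embed the 64 features into 2 dimensions for visualization
tsne = TSNE(n_components=2, perplexity=30, init="pca",
            learning_rate="auto", random_state=42)
X_tsne = tsne.fit_transform(X)

# Each point is one image, colored by its digit class;
# well-separated groups indicate that t-SNE preserved the local structure
plt.scatter(X_tsne[:, 0], X_tsne[:, 1], c=y, cmap="tab10", s=10)
plt.xlabel("t-SNE dimension 1")
plt.ylabel("t-SNE dimension 2")
plt.title("t-SNE embedding of the digits dataset")
plt.show()
```

If the clusters in such a plot line up with the true classes, the embedding has captured meaningful neighborhood structure, which is exactly the kind of comparison against PCA this article builds toward.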

Apart from LDA, which uses class labels, dimensionality reduction techniques fall under the category of unsupervised machine learning, in which we can reveal hidden patterns and important relationships in the data without requiring labels.

So, dimensionality reduction algorithms deal with unlabeled data. When training such algorithms, the…

Written by Rukshan Pramoditha

3,000,000+ Views | BSc in Stats (University of Colombo, Sri Lanka) | Top 50 Data Science, AI/ML Technical Writer on Medium
