What is the Curse of Dimensionality?

Dr. Roi Yehoshua
AI Made Simple
Published in
6 min readApr 8, 2023

--

The curse of dimensionality refers to various problems that arise when working with high-dimensional data. In this article we will discuss these problems and how they affect machine learning algorithms, and then suggest a few ways to deal with them.

Living in a High-Dimensional Space

We are so used to live in three dimensions, that our intuition fails when we try to imagine things in high-dimensional spaces. Many things behave very differently in high-dimensional spaces.

For example, if we pick a random point in a unit square (a square whose side length is 1), its probability of being located less than 0.01 from the border of the square is less than 2%. This is because the area of the unit square is 1, whereas the area of the inner square with side length 0.99 is 0.99 × 0.99 = 0.9801. Therefore, the probability that the point falls inside the area between the inner square and the unit square is 1–0.9801 = 0.0199.

On the other hand, if we pick a random point in a unit hypercube with 1,000 dimensions, its probability of being located less than 0.01 from the border of the hypercube is greater than 99.99%!

The reason is that the volume of the unit hypercube is 1¹⁰⁰⁰ = 1, while the volume of the inner hypercube with side length of 0.99 is 0.99¹⁰⁰⁰ = 0.00004317. Therefore, the probability that the point falls inside the volume between the inner hypercube and the unit hypercube is 1–0.00004317 = 0.99995683.

--

--

Dr. Roi Yehoshua
AI Made Simple

Teaching Professor for Data Science and ML at Northeastern University | Top Writer in AI | 200K+ Views on Medium | https://www.linkedin.com/in/roi-yehoshua/