Why Is the Curse of Dimensionality Important in Machine Learning

Learn the surprising effects that occur in high-dimensional space

Rukshan Pramoditha
4 min read · May 27, 2024
Image generated by the author using DALL-E

In machine learning, the number of features (variables) in a dataset is called the dimensionality of data.

When the number of features increases (which is very common in real-world datasets), the feature space becomes high-dimensional, and our brains struggle to visualize data beyond three dimensions!

The curse of dimensionality

After overfitting, the next big problem in ML is the curse of dimensionality.

When the dimensionality of the data is high, machine learning models face many challenges. The curse of dimensionality (a term coined by Richard Bellman in 1961) refers to the challenges and surprising effects that arise in high-dimensional space.
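One of these surprising effects can be shown with simple arithmetic (this example is mine, not from the article): in a unit hypercube, almost all of the volume concentrates near the boundary as the dimension grows, so uniformly sampled points end up close to a face rather than in the interior.

```python
# Fraction of a unit hypercube's volume that lies within 0.05 of a face.
# The inner cube with a 0.05 margin on each side has volume (1 - 2*0.05)**d,
# so the boundary shell holds 1 - 0.9**d of the total volume.
for d in (2, 10, 100):
    boundary_fraction = 1 - 0.9 ** d
    print(d, f"{boundary_fraction:.4f}")
# 2   -> 0.1900
# 10  -> 0.6513
# 100 -> 1.0000 (to four decimal places)
```

In 2 dimensions only 19% of the volume is near the boundary; by 100 dimensions it is essentially all of it.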

Challenges that occur in high-dimensional spaces

Some common challenges include:

  • Training becomes much slower as the number of features grows.
  • It becomes harder for an ML model to find a good solution when there are so many features.
  • The model becomes more complex as the number of features increases.
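Another effect behind these challenges, which we can sketch with a short simulation (my example, not the author's), is distance concentration: as dimensionality grows, the gap between the nearest and farthest neighbor shrinks relative to the average distance, which degrades distance-based models such as k-NN.

```python
# Sketch: measure how "contrasty" Euclidean distances are between random
# points as the dimensionality increases.
import numpy as np

def relative_contrast(n_points: int, n_dims: int, seed: int = 0) -> float:
    """Return (max_dist - min_dist) / mean_dist from one random point
    to all the others, for points sampled uniformly in a unit hypercube."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, n_dims))
    # Euclidean distances from the first point to every other point
    dists = np.linalg.norm(points[1:] - points[0], axis=1)
    return (dists.max() - dists.min()) / dists.mean()

for d in (2, 10, 100, 1000):
    print(d, round(relative_contrast(200, d), 3))
```

Running this, the relative contrast drops steadily with dimension: in 1,000 dimensions the nearest and farthest points are almost equally far away, so "nearest neighbor" carries little information.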
