Top 20 Machine Learning and Deep Learning Mistakes That Secretly Happen Behind the Scenes

And how to avoid them

Rukshan Pramoditha
Published in Data Science 365
16 min read · Oct 12, 2022

Image by Kerstin Riemer from Pixabay (slightly edited by the author)

Learning from mistakes is the key to success in any field.

That’s why we should analyze mistakes once we’ve identified them.

Mistakes are often made by beginners in machine learning and deep learning. In some cases, mistakes can happen secretly so that we’re not even aware of them. That’s the dangerous part!

Most of my articles published so far were designed to address the most common mistakes in machine learning and deep learning.

Today’s article explains the top 20 mistakes (in my opinion) that most people make, or that happen secretly, when building machine learning and deep learning models.

By the end of this article, you’ll have solid knowledge of these mistakes and solutions for most of the questions you have about machine learning and deep learning.

Let’s get started!

1. Multicollinearity

Multicollinearity occurs when input features (variables) are highly correlated with other features in the dataset. It can negatively affect the performance of ML models; in linear models, for example, it makes the coefficient estimates unstable and hard to interpret.

In most cases, people don’t know whether their models suffer from multicollinearity as this happens behind the scenes!

How to identify multicollinearity

A simple and effective way of identifying multicollinearity is to compute the correlation coefficients between the input features and visualize them as a heatmap.

An example of a heatmap (Image by author)

The image above shows a heatmap of 30 input features. The color of each tiny square tells us whether a pair of input features is correlated: light colors indicate a high correlation and dark colors indicate a low correlation.
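As a rough illustration of the idea, here is a minimal sketch that builds a correlation heatmap and lists the highly correlated pairs. It does not use the 30-feature dataset from the image. The synthetic features `x1`, `x2`, `x3`, the helper `highly_correlated_pairs`, the 0.9 threshold, the `viridis` colormap, and the output filename are all assumptions made for this example.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to show the plot in a window
import matplotlib.pyplot as plt

# Synthetic data (an assumption for illustration, not the article's dataset)
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + rng.normal(scale=0.1, size=n)  # almost a linear copy of x1
x3 = rng.normal(size=n)                        # generated independently of x1 and x2
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# Pairwise Pearson correlation coefficients between the input features
corr = X.corr()


def highly_correlated_pairs(corr, threshold=0.9):
    """Return feature pairs whose absolute correlation exceeds the threshold.

    Hypothetical helper written for this example; the threshold is a judgment call.
    """
    cols = list(corr.columns)
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if abs(corr.iloc[i, j]) > threshold:
                pairs.append((cols[i], cols[j]))
    return pairs


pairs = highly_correlated_pairs(corr)
print(pairs)

# Draw the heatmap; with "viridis", light squares mean high correlation
fig, ax = plt.subplots()
im = ax.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="viridis")
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns)
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(im, ax=ax, label="Pearson correlation")
fig.savefig("correlation_heatmap.png")
```

With many features, reading the heatmap by eye gets hard, so a list of pairs above a chosen threshold can be a useful companion to the picture.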

How to remove multicollinearity

An easy way to remove multicollinearity is to apply PCA to the input features before building the model, since the principal components it produces are uncorrelated with each other. PCA acts as a data preprocessing step…
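The following is a minimal sketch of PCA used as a preprocessing step, assuming scikit-learn is available. The synthetic features, the standardization step, and the choice to keep 95% of the variance are assumptions made for this example, not settings taken from the article.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration): x2 is almost a copy of x1
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + rng.normal(scale=0.1, size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# PCA is sensitive to feature scale, so standardize the features first
X_scaled = StandardScaler().fit_transform(X)

# Keep as many components as needed to explain 95% of the variance
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X_scaled)

# The transformed features (principal components) are uncorrelated,
# so the off-diagonal entries of their correlation matrix are close to zero
pc_corr = np.corrcoef(X_pca, rowvar=False)
max_off_diag = np.abs(pc_corr - np.eye(pc_corr.shape[0])).max()

print(X_pca.shape)
print(pca.explained_variance_ratio_)
print(max_off_diag)
```

The model is then trained on `X_pca` instead of the original features. The trade-off is that each component is a mix of the original features, so the inputs become harder to interpret.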

Rukshan Pramoditha
Data Science 365

2,000,000+ Views | BSc in Stats | Top 50 Data Science, AI/ML Technical Writer on Medium | Data Science Masterclass: https://datasciencemasterclass.substack.com/
