Top 20 Machine Learning and Deep Learning Mistakes That Secretly Happen Behind the Scenes
And how to avoid them
Learning from mistakes is the key to success in any field.
That’s why we should analyze mistakes as soon as we identify them.
Beginners in machine learning and deep learning often make mistakes. In some cases, the mistakes happen silently, so we’re not even aware of them. That’s the dangerous part!
Most of my articles published so far were designed to address the most common mistakes in machine learning and deep learning.
Today’s article explains the top 20 mistakes (in my opinion) that most people make, or that happen silently, when building machine learning and deep learning models.
By the end of this article, you’ll have a solid understanding of these mistakes and solutions for many of the questions you may have about machine learning and deep learning.
Let’s get started!
1. Multicollinearity
When input features (variables) are highly correlated with other features in the dataset, the dataset is said to exhibit multicollinearity. It can negatively affect ML models, particularly linear ones, where it makes coefficient estimates unstable and hard to interpret.
In most cases, people don’t know whether their models suffer from multicollinearity, as this happens behind the scenes!
How to identify multicollinearity
A simple and effective way to identify multicollinearity is to visualize the correlation coefficients between input features as a heatmap.
The above image shows a heatmap of 30 input features. By looking at the color of each small square, we can tell whether a pair of input features is correlated. In this color scheme, light colors show a high correlation and dark colors show a low correlation.
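As a rough illustration of the heatmap approach, here is a minimal sketch on hypothetical toy data (three made-up features named `x1`, `x2`, and `x3`, not the 30-feature dataset from the image). It assumes pandas and matplotlib are installed, and the 0.9 threshold for flagging pairs is an arbitrary choice.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical toy data: x2 is almost a copy of x1, x3 is independent noise.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + rng.normal(scale=0.1, size=500)
x3 = rng.normal(size=500)
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# Pairwise Pearson correlation coefficients between the input features.
corr = X.corr()

# Draw the correlation matrix as a heatmap.
# With the "viridis" colormap, light squares mean a high positive correlation.
fig, ax = plt.subplots(figsize=(4, 4))
im = ax.imshow(corr.to_numpy(), cmap="viridis", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=90)
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(im, ax=ax, label="Pearson correlation")
fig.tight_layout()
plt.show()

# List feature pairs whose absolute correlation exceeds the (arbitrary) threshold.
threshold = 0.9
high_pairs = [
    (a, b, float(corr.loc[a, b]))
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if abs(corr.loc[a, b]) > threshold
]
print(high_pairs)
```

With many features, the printed list of pairs is often easier to act on than the colors alone. Note that strong negative correlations also signal multicollinearity, which is why the threshold is applied to the absolute value.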
How to remove multicollinearity
One of the easiest ways to remove multicollinearity is to apply PCA to the input features before building the model. The principal components that PCA produces are uncorrelated with one another. PCA acts as a data preprocessing step…
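The PCA step might look like the sketch below, assuming scikit-learn is available. The toy data, the standardization step, and the 95% explained-variance target are illustrative assumptions, not settings taken from this article.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: the second column is almost a copy of the first.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
X = np.column_stack([
    x1,
    x1 + rng.normal(scale=0.1, size=500),
    rng.normal(size=500),
])

# PCA is sensitive to feature scale, so standardize the features first.
X_scaled = StandardScaler().fit_transform(X)

# Keep enough components to explain 95% of the variance (an arbitrary target).
pca = PCA(n_components=0.95, svd_solver="full")
X_pca = pca.fit_transform(X_scaled)

# Check that the transformed features are (numerically) uncorrelated.
corr_pca = np.corrcoef(X_pca, rowvar=False)
off_diag = corr_pca - np.diag(np.diag(corr_pca))
max_off_diag = float(np.abs(off_diag).max())

print(X_pca.shape)
print(max_off_diag)
```

In a real project, the scaler and PCA would be fitted on the training data only and then applied to the test data, so that no information leaks from the test set into the preprocessing.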