Overfitting, underfitting, and the bias-variance tradeoff are foundational concepts in machine learning. They matter because they describe the state of a model based on its performance. The best way to understand these terms is as a tradeoff between the bias and the variance of the model. Let's start with the phenomenon of overfitting and underfitting.
Overfitting occurs when a statistical model or machine learning algorithm captures the noise in the data rather than only the underlying pattern. Intuitively, overfitting occurs when the model or the algorithm fits the training data too well. Specifically, overfitting occurs if the model or algorithm shows low bias but high variance. Overfitting is often the result of an excessively complicated model, and it can be detected and limited by fitting multiple models and using validation or cross-validation to compare their predictive accuracy on held-out data.
Underfitting occurs when a statistical model or machine learning algorithm cannot capture the underlying trend of the data. Intuitively, underfitting occurs when the model or the algorithm does not fit the data well enough. Specifically, underfitting occurs if the model or algorithm shows low variance but high bias. Underfitting is often a result of an excessively simple model.
Both overfitting and underfitting lead to poor predictions on new data sets.
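To make this concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the target function (a sine curve with Gaussian noise), the sample sizes, and the choice of k-nearest-neighbours regression as the model. With k=1 the model memorizes the training set (overfitting), with k equal to the training set size it always predicts the overall mean (underfitting), and a moderate k usually sits in between.

```python
import math
import random


def make_data(n, rng, noise=0.3):
    """Sample n noisy points from y = sin(x) on [0, 2*pi] (a made-up target)."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


rng = random.Random(0)
train_x, train_y = make_data(200, rng)
test_x, test_y = make_data(500, rng)

# k=1: very flexible, k=15: moderate, k=200: as rigid as this model gets.
results = {}
for k in (1, 15, 200):
    results[k] = (
        mse(train_x, train_y, train_x, train_y, k),  # training error
        mse(train_x, train_y, test_x, test_y, k),    # test error
    )
    print(f"k={k:3d}  train MSE={results[k][0]:.3f}  test MSE={results[k][1]:.3f}")
```

When you run it, compare the two columns: the k=1 model has zero training error but a clearly worse test error, while the k=200 model is poor on both.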
Now, let's understand bias and variance in simpler terms. (Much simpler terms!)
What is Bias?
Bias is the difference between the average prediction of our model and the correct value we are trying to predict. A model with high bias pays very little attention to the training data and oversimplifies the problem.
Simple rule of thumb: “High bias shows up as high error on the training data!”
What is Variance?
Variance is the variability of the model's prediction for a given data point when the model is trained on different samples of the data; it tells us how spread out those predictions are. A model with high variance pays too much attention to the training data and does not generalize to data it hasn't seen before.
Simple rule of thumb: “High variance shows up as a test error much higher than the training error!”
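The two definitions can be checked with a small simulation. The sketch below is only an illustration under assumptions of my own: a made-up true function (a sine curve), Gaussian noise, and two deliberately extreme models. It retrains each model on many fresh training sets, then measures bias as the average prediction minus the true value, and variance as the spread of those predictions, both at a single query point.

```python
import math
import random
import statistics


def true_f(x):
    """Made-up ground-truth function for the simulation."""
    return math.sin(x)


def sample_training_set(n, rng, noise=0.3):
    """Draw n noisy observations of true_f on [0, 2*pi]."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def predict_mean(xs, ys, x0):
    """Very simple model: ignore x0 and always predict the mean target."""
    return sum(ys) / len(ys)


def predict_nearest(xs, ys, x0):
    """Very flexible model: copy the target of the closest training point."""
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x0))
    return ys[i]


def bias_and_variance(predict, x0, n_sets=500, n_points=50, seed=0):
    """Retrain on many fresh training sets and summarize the predictions at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_sets):
        xs, ys = sample_training_set(n_points, rng)
        preds.append(predict(xs, ys, x0))
    bias = statistics.mean(preds) - true_f(x0)   # average prediction minus truth
    variance = statistics.pvariance(preds)       # spread of the predictions
    return bias, variance


x0 = math.pi / 2  # query point where the true value is 1.0
bias_mean, var_mean = bias_and_variance(predict_mean, x0)
bias_nn, var_nn = bias_and_variance(predict_nearest, x0)
print(f"mean model:    bias={bias_mean:+.3f}  variance={var_mean:.3f}")
print(f"nearest model: bias={bias_nn:+.3f}  variance={var_nn:.3f}")
```

The mean model should come out with a large bias and a small variance, and the nearest-neighbour model with the opposite pattern, which is the tradeoff in miniature.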
To make these concepts clearer, I have divided the discussion into two parts: bias and variance for regression models, and then for classification models.
Considering Regression models:
We can clearly see that Model-1 and Model-3 are underfitting and overfitting, respectively.
Model-1 has not captured the trend properly because the model is too simple, so both its training and test accuracy suffer.
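In practice, high bias shows up as high training error, while high variance shows up as a large gap between training and test error. Model-1 has low training and test accuracy, i.e. it has high bias (high training error). Its test error is also high, but that comes from the same bias rather than from variance: the two errors are close to each other, so Model-1 has low variance, which matches the definition of underfitting given above.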
Similarly, Model-3 has fit the training data too closely, which is why it fails on the test data (low test accuracy). Since its training accuracy is high and its test accuracy is low, Model-3 has low bias (low training error) and high variance (high test error).
Model-2 is in the “Just Right” condition: it performs well on both the training set and the test set. It has high training accuracy (low bias, low training error) and high test accuracy (low variance, low test error).
Now, let’s consider classification models. Please have a look at the image below!
Here we have three models, with the following training and test errors.
As we can see, classification Model-1 has a low training error (2%) but a high test error (18%). From the concepts explained earlier, we can conclude that the model has low bias (low training error) and high variance (high test error), i.e. Model-1 is clearly overfitting!
Similarly, we can conclude that classification Model-2 is clearly an underfitting model. As for Model-3, it should be considered the most generalized model, and the one I would recommend.
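The reasoning above can be written down as a tiny helper. This is only a sketch: the function name `diagnose` and both thresholds are my own illustrative assumptions, not standard values, and what counts as a "high" error depends on the problem. Only the 2% / 18% pair comes from Model-1 above; the other two pairs are hypothetical numbers chosen to show the remaining cases.

```python
def diagnose(train_error, test_error, high_error=0.10, large_gap=0.05):
    """Label a model from its train/test error rates (fractions in [0, 1]).

    high_error and large_gap are illustrative assumptions, not standard values.
    """
    gap = test_error - train_error
    if train_error >= high_error:
        # The model cannot even fit the data it was trained on.
        return "underfitting (high bias)"
    if gap >= large_gap:
        # Good on training data, much worse on unseen data.
        return "overfitting (high variance)"
    return "well generalized"


print(diagnose(0.02, 0.18))  # Model-1 from the text: 2% train, 18% test
print(diagnose(0.20, 0.22))  # hypothetical underfitting model
print(diagnose(0.03, 0.05))  # hypothetical well-generalized model
```

Note that a model with a high training error and a large gap gets labelled as underfitting here, because the helper checks the training error first; that ordering is a design choice of this sketch.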
That was the explanation of underfitting, overfitting, bias, and variance for regression and classification models!
With the explanation done, let's look at a graphical plot of these concepts. Please have a look at the figure below!
In Figure 3, the dotted line passes through the points we should aim for when designing our model; a model there would be the “Most Generalized Model”.
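One common way to look for that "most generalized" point in practice is to sweep a complexity setting and keep the value with the lowest cross-validated error. The sketch below is a minimal illustration under assumptions of my own: synthetic sine data, k-nearest-neighbours regression where a smaller k means a more flexible model, and a simple 5-fold split. It is not tied to the figure in any exact way.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k points in train closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cv_error(data, k, folds=5):
    """Mean squared validation error of k-NN, averaged over the folds."""
    total = 0.0
    for f in range(folds):
        valid = data[f::folds]                                      # every folds-th point
        train = [p for i, p in enumerate(data) if i % folds != f]   # the rest
        total += sum((knn_predict(train, x, k) - y) ** 2 for x, y in valid) / len(valid)
    return total / folds


# Synthetic data: noisy samples of sin(x), an assumption for this demo.
rng = random.Random(1)
data = []
for _ in range(150):
    x = rng.uniform(0.0, 2.0 * math.pi)
    data.append((x, math.sin(x) + rng.gauss(0.0, 0.3)))

# Small k = flexible model, large k = rigid model (k=120 uses every training point).
candidates = [1, 3, 5, 9, 15, 25, 45, 75, 120]
errors = {k: cv_error(data, k) for k in candidates}
best_k = min(errors, key=errors.get)

for k in candidates:
    print(f"k={k:3d}  cross-validated MSE={errors[k]:.3f}")
print("lowest validation error at k =", best_k)
```

Plotting `errors` against k would give a validation curve: high at both extremes and lowest somewhere in the middle.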
That's all from my side! If you found this blog helpful, do like (clap!) it, and comment with your views or any point I missed. These reviews help me grow and bring better content next time!
Also, connect with me on LinkedIn! (I love to connect with amazing people like you!)
Gaurav Sahani - Data Analyst Intern - Neubrain Solutions Pvt Ltd | LinkedIn
Also, follow and check out my GitHub for my work and project contributions!
GauravSahani1417 - Overview
Thank you for the precious time you gave to my blog!