How Are Bias-Variance, Model Complexity & Underfitting-Overfitting Related?

Agasti Kishor Dukare · Published in The Code Monster · 4 min read · Jan 18, 2021

In this article, I’ll explain, in the simplest possible terms, how Prediction Errors relate to ML model complexity and thus to model performance.


Whether you are a newbie in the ML space or have years of experience, you must have noticed that whenever you search for Overfitting on the web, you’ll see one of the following one-liners:

Overfitting refers to a model that models the training data too well.

or

Overfitting models have high variance and low bias.

These definitions suffice if your goal is just to pass an exam or clear an interview. But if, like me, you want to understand things from the basics, these surface-level explanations will not be enough.

Why is a High Variance, Low Bias model Overfitting, and where does model complexity come into the picture? How does “the model paying too much attention to the training data” translate into high variance and thus into overfitting?

I needed more explanation. Hence, I dug deep, so now you don’t have to! Here’s my take on the relation between Overfitting-Underfitting & Bias-Variance.

Bias & Variance:

In any ML model, predictions will always carry some amount of error. We call it the Prediction Error, and it can be broken down into three components: error due to Bias, error due to Variance, and error due to noise (the irreducible error).

Note: While the first two errors, i.e. Bias & Variance, are controllable, we have no control over the error due to noise.
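
For squared-error loss, this decomposition has a standard formal statement (the notation below is mine, not from this article): f is the true function, f̂ is the model trained on a random dataset, and σ² is the noise variance.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{Noise}}
```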

Let’s say you have an ML model and you run it on a dataset. If you take the average of the residuals (the differences between the predictions and the true values), you get the Bias. Thus Bias is an indicator of how far off the model’s predictions, as a whole, are from the true values.

Now let’s say you take 5 datasets that differ from each other and train the same ML model on each of them. For a fixed input, each trained model gives a prediction; look at how far each prediction falls from the average of all 5 predictions. The spread of the predictions around their average is the Variance.

Thus, from these two definitions we can see that Bias tells us, on average, how accurate our model is at making correct predictions, while Variance tells us how consistent it is from one dataset to another.
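
To make this concrete, here is a minimal sketch of the 5-datasets experiment (my own illustration; the true function, noise level, and model choice are assumptions made purely for demonstration):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def true_f(x):
    return np.sin(2 * np.pi * x)               # the underlying pattern

def sample_dataset(n=50, noise=0.3):
    x = rng.uniform(0, 1, n)
    y = true_f(x) + rng.normal(0, noise, n)    # pattern + noise
    return x.reshape(-1, 1), y

x_test = np.array([[0.25]])                    # one fixed test input
preds = []
for _ in range(5):                             # 5 different datasets
    X, y = sample_dataset()
    model = DecisionTreeRegressor(max_depth=2).fit(X, y)
    preds.append(model.predict(x_test)[0])

preds = np.array(preds)
bias = preds.mean() - true_f(0.25)             # average prediction vs. truth
variance = preds.var()                         # spread of the 5 predictions
print(f"bias: {bias:.3f}, variance: {variance:.3f}")
```

A shallow tree like this one tends to show noticeable Bias but modest Variance; swapping in a much deeper tree typically flips that balance.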

For any ML problem, our goal is to create a model that is consistent & highly accurate, i.e. low in Bias & low in Variance.

Bias-Variance & Model Complexity:

A High Bias model is highly inaccurate in its predictions. This means the model fails to capture the patterns & relationships in the data, because it cannot learn from the data well. In other words, the model is not complex enough.

A High Variance model is inconsistent in its predictions: they vary drastically when the dataset changes, which indicates that the model fails to generalize well. This happens when the model captures not only the underlying pattern but also the noise in the data, a sign that the model is too high in complexity.

Underfitting-Overfitting:

Because High Bias models are not complex enough, they fail to capture the underlying patterns in the data and thus produce high training & test error, which is the signature of Underfitting. It is therefore evident that Underfitting is the result of a model that is not complex enough, i.e. High in Bias.

Conversely, a model with High Variance is high in complexity. While learning the underlying patterns in the data, it also learns the outliers & noise. It therefore achieves high accuracy on the familiar dataset, i.e. the training data. But when a slightly different dataset, i.e. the test data, is presented, its predictions vary considerably and accuracy drops. This is the sign of Overfitting. So, Overfitting is the result of a model that is too high in complexity, i.e. High in Variance.
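
Here’s a minimal sketch of both failure modes (a hypothetical polynomial-regression setup; the data, noise level, and degrees are assumptions chosen purely for illustration). One model is too simple, one is too complex, and the training/test errors tell the story:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, 40).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.3, 40)
X_train, y_train = X[:30], y[:30]
X_test, y_test = X[30:], y[30:]

for degree in (1, 15):  # too simple vs. too complex
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree {degree:>2}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```

On a run like this, the degree-1 model shows high error on both sets (Underfitting), while the degree-15 model drives training error toward zero but does far worse on the test set (Overfitting).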

Bias-Variance Tradeoff:

As mentioned before, our goal is to have a model that is low in Bias & low in Variance. To avoid High Bias, we need to increase the complexity of our model; similarly, to avoid High Variance, we need to decrease it.

To achieve a low Bias-low Variance model, we would need a model that is low in complexity and high in complexity at the same time, which is impossible! This is known as the Bias-Variance Tradeoff. The best we can do is pick a complexity somewhere in between, where the combined error is smallest. This is also why a model that is Underfitting, i.e. High in Bias, is typically low in Variance, & vice versa.
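
You can watch the tradeoff play out by sweeping the complexity knob in the same hypothetical polynomial setup as above: as the degree grows, training error keeps falling, while test error falls and then rises again, bottoming out somewhere in between.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, 40).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.3, 40)
X_tr, y_tr, X_te, y_te = X[:30], y[:30], X[30:], y[30:]

# Sweep the complexity knob: train error keeps falling, test error is U-shaped.
for degree in range(1, 16):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    train_err = mean_squared_error(y_tr, model.predict(X_tr))
    test_err = mean_squared_error(y_te, model.predict(X_te))
    print(f"degree {degree:>2}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```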

There you have it! This is my understanding of how Bias-Variance, Model Complexity & Underfitting-Overfitting are interconnected. I hope you guys found it helpful.

If you have any queries or find any mistakes in concepts please feel free to write in the comments. Thank you!
