Bias & Variance in Machine Learning

This blog explains two fundamentals of Machine Learning: Bias & Variance. Understanding these terms is important for everything you learn later in this field, because together they describe how well a model will perform on new data.

Harshit Dawar
The Startup
4 min read · Oct 26, 2020

I am pretty sure that most of you have heard these terms, but you may find them confusing because they also appear in other fields, such as statistics.

How are these terms used in Machine Learning, & what role do they play? If you have these questions, or even if you have not heard these terms at all, don’t worry: this blog will guide you to the exact meaning of these terms with practical & diagrammatic explanations.

Introduction to Bias & Variance!

Both of these terms are easiest to understand by visualizing a graph. Let’s take a very simple Machine Learning algorithm, Linear Regression, as an example. Whenever the fit of a Linear Regression model is plotted against the data, both Bias & Variance can be observed, & together they indicate how well the Model will predict on future data.

Generally, the two terms pull in opposite directions: reducing one tends to increase the other. Many of you may have tried to understand these terms & found them confusing or difficult. You need not worry; this blog explains each of them in isolation, rather than both in one go.

Both terms are explained individually below, with graphical representations.

Bias

Bias refers to how far the model’s predictions are, on average, from the actual values.

For example, let’s say the actual value of a quantity is 100, but the model predicts 35. It is clear that the model has not learned the patterns in the data, & that is why it cannot predict the correct value (or even a value close to it).

Because the model is unable to capture the patterns in the data, its predictions are far from the actual values. This behaviour arises from a very large bias, & it is exactly what Underfitting of the Model means. Therefore, if Bias is very high, then there will be underfitting.

A graphical representation of high bias is shown below:

Image by Author!

The above graph shows a model that fails to capture the patterns in the data, which results in a very large difference (very large bias) between the actual & predicted values.
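To make high bias concrete, here is a minimal Python sketch. It is only an illustration under assumed conditions, not the code behind the graph: the toy data follows y = x², & the helper names fit_line & mse are hypothetical. A straight line cannot bend to follow the curve, so its error stays large even on the data it was trained on.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mse(actual, predicted):
    """Mean squared error between two equally long lists."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Assumed toy data: the real pattern is a curve (y = x^2), not a line.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
train_mse = mse(ys, predictions)

print("slope:", slope, "intercept:", intercept)
print("training MSE:", train_mse)
print("actual at x=3:", ys[-1], "predicted at x=3:", predictions[-1])
```

With this data the fitted line is flat (slope 0, intercept 4), so it predicts 4 where the actual value is 9. The model is simply too rigid for the pattern, which is what underfitting looks like.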

Variance

Variance is, in a sense, the opposite of Bias. It refers to how strongly the model depends on the exact data points it was trained on. A high-variance model is flexible enough to capture every point in the training dataset, including its noise, but it works poorly with data points it has not seen yet, i.e. the data points on which it was not trained.

For example, suppose the actual values are [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], & the model reproduces the results for exactly these values [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. If I then provide any value other than those listed above, the model will not be able to provide the correct result.

Image by Author!

The above image illustrates the behavior of a model that is just learning the exact data points, rather than observing the pattern. This is very high variance, which leads to overfitting of the model.

When the variance is very high, the Model loses its ability to generalize!
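The same idea can be sketched in code. The example below is a hedged illustration, not the model from the image: it assumes the real pattern is y = 2x, adds a fixed noise of +1 or -1 to each training value, & uses Lagrange interpolation, a technique swapped in here only because it passes through every training point exactly. The names lagrange_predict & mse are hypothetical helpers.

```python
def lagrange_predict(train_x, train_y, x):
    """Evaluate, at x, the polynomial that passes through every training point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(train_x, train_y)):
        basis = 1.0
        for j, xj in enumerate(train_x):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


def mse(actual, predicted):
    """Mean squared error between two equally long lists."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Assumed training data: the pattern y = 2x plus a fixed noise of +1 or -1.
train_x = list(range(8))
train_y = [2 * x + (1 if x % 2 == 0 else -1) for x in train_x]

# Error on the points the model was trained on.
train_pred = [lagrange_predict(train_x, train_y, x) for x in train_x]
train_mse = mse(train_y, train_pred)

# Error on unseen points halfway between the training points,
# compared against the noise-free pattern y = 2x.
test_x = [x + 0.5 for x in range(7)]
test_y = [2 * x for x in test_x]
test_pred = [lagrange_predict(train_x, train_y, x) for x in test_x]
test_mse = mse(test_y, test_pred)

print("training MSE:", train_mse)
print("test MSE:", test_mse)
```

On the training points the error is zero, yet halfway between them the curve swings far away from the simple pattern. The model has memorized the points, noise included, instead of learning the trend.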

What should be the ideal values of Bias & Variance?

Ideally, both the bias & the variance should be low, i.e. the best model is one that has low bias & low variance. In practice, since reducing one tends to increase the other, the goal is to find the balance that keeps both as low as possible.

A model with low bias & low variance observes the pattern in the dataset & is therefore able to generalize to new data.
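One way to see all three situations side by side is to train the same kind of model on many slightly different training sets & look at its predictions for one fixed input. The sketch below does this under assumptions chosen purely for illustration: the real pattern is y = 2x, the noise is Gaussian with standard deviation 1, & the three models (fit_constant, fit_line, fit_memorizer) along with bias_variance are hypothetical helpers. Squared bias is measured as the squared gap between the average prediction & the true value; variance is how much the predictions change from one training set to another.

```python
import random


def true_f(x):
    return 2 * x  # assumed "real" pattern behind the data


def fit_constant(xs, ys):
    """Too simple: always predicts the average training value."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Straight line fitted by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Too flexible: returns the stored value of the nearest training point."""
    def predict(x):
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
        return ys[nearest]
    return predict


def bias_variance(fit, x0, trials=2000, seed=42):
    """Estimate squared bias & variance of the prediction at x0."""
    rng = random.Random(seed)
    xs = list(range(10))
    preds = []
    for _ in range(trials):
        # A fresh noisy training set for every trial.
        ys = [true_f(x) + rng.gauss(0, 1) for x in xs]
        preds.append(fit(xs, ys)(x0))
    mean_pred = sum(preds) / trials
    bias_sq = (mean_pred - true_f(x0)) ** 2
    variance = sum((p - mean_pred) ** 2 for p in preds) / trials
    return bias_sq, variance


models = [("constant", fit_constant), ("line", fit_line), ("memorizer", fit_memorizer)]
results = {name: bias_variance(fit, x0=8) for name, fit in models}

for name, (bias_sq, variance) in results.items():
    print(f"{name:10s} bias^2 = {bias_sq:8.3f}   variance = {variance:6.3f}")
```

Under these assumptions, the constant model shows high bias with low variance, the memorizer shows low bias with high variance, & the straight line keeps both low. The exact numbers depend on the seed & on the made-up data, so treat them as a demonstration rather than a benchmark.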

I hope this article has explained the topic clearly. Thank you so much for investing your time in reading my blog & boosting your knowledge. If you like my work, then I request you to give this blog some applause!
