Bias Variance Trade-off

Nikhil Upadhyay
Published in Knowledge Gurukul
Mar 9, 2021

This publication explains what bias and variance mean in machine learning.

First of all, the bias-variance trade-off is a fundamental concept used in model evaluation: it describes how accurately our model performs on a given data set.

What is Bias?

Bias measures how far our predictions are, on average, from the real values.

It is the difference between the actual data points and the predicted data points. High bias leads to high error on both the training and the testing data set.

Bias is of 2 types:

  1. High Bias (the predicted values differ greatly from the real values, so the error is high)
  2. Low Bias (the predicted values are close to the real values, so the model gives good results)

What is Variance?

Variance measures how much the model's predictions change when the training data changes. A high-variance model fits the training data set very well, covering almost every data point, but predicts poorly when we introduce a new data set. In the statistical sense, high variance indicates that values are very spread out from the mean and from one another.

Variance is of 2 types:

  1. High Variance (the model pays too much attention to the training data and gives very bad predictions on new or testing data)
  2. Low Variance (the model's predictions stay stable across data sets, so it predicts new data about as well as training data)
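
To make the two definitions concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration and is not from the original article: the "real" function `true_f`, the noise level, the query point `x0 = 0.9`, and the two toy models (one that always predicts the mean of the training targets, and one that copies the target of the nearest training point). It retrains each model on many fresh training sets and measures how far the average prediction is from the real value (bias) and how much the predictions move around (variance).

```python
import random
import statistics


def true_f(x):
    # Hypothetical "real" relationship between x and y.
    return x * x


def make_training_set(rng, n=30, noise_sd=0.2):
    # Draw n noisy observations of true_f on [0, 1).
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def predict_mean(xs, ys, x0):
    # Very rigid model: ignores x0 and always predicts the average target.
    return statistics.mean(ys)


def predict_nearest(xs, ys, x0):
    # Very flexible model: copies the target of the closest training point.
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x0))
    return ys[i]


def bias_and_variance(predict, x0=0.9, trials=500, seed=0):
    # Retrain on many independent training sets and collect predictions at x0.
    rng = random.Random(seed)
    preds = []
    for _ in range(trials):
        xs, ys = make_training_set(rng)
        preds.append(predict(xs, ys, x0))
    bias = statistics.mean(preds) - true_f(x0)   # average miss
    variance = statistics.pvariance(preds)       # spread of the predictions
    return bias, variance


bias_mean, var_mean = bias_and_variance(predict_mean)
bias_nn, var_nn = bias_and_variance(predict_nearest)

print(f"mean model:    bias={bias_mean:+.3f}  variance={var_mean:.4f}")
print(f"nearest model: bias={bias_nn:+.3f}  variance={var_nn:.4f}")
```

In this setup the mean model should show the larger bias and the nearest-point model the larger variance, which matches the high-bias and high-variance descriptions above.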

To understand bias and variance more clearly, take the bull's-eye example:

  1. Low Bias, Low Variance
  2. Low Bias, High Variance
  3. High Bias, Low Variance
  4. High Bias, High Variance

Figure: Bias-variance bull's-eye diagram

In the above diagram, the centre of the target is a model that perfectly predicts the correct values. As we move away from the bull's-eye, our predictions get worse and worse.

1. Low Bias and Low Variance:

This is the ideal situation for our ML model. In the diagram, the data points sit at the centre of the target, so the model gives good predictions with no error or very little error.

2. Low Bias and High Variance (Over-fitting):

Over-fitting occurs when our ML model tries to cover all the data points, or more than the required data points, present in the given data set. Because of this, the model starts capturing the noise and inaccurate values present in the data set, and these reduce the efficiency and accuracy of the model on new data. The over-fitted model has low bias and high variance.

The chance of over-fitting increases the longer we train our model: the more we train, the more likely the model is to become over-fitted.
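
A small sketch can show what "covering all the data points" looks like in numbers. It is only an illustration under assumptions that are not in the original article: the data come from a made-up function `true_f` with added noise, and the model is a simple k-nearest-neighbours regressor written by hand, where a smaller `k` means a more flexible model. Note that flexibility is controlled here through `k` rather than through training time. With `k = 1` the model memorises every training point, noise included.

```python
import math
import random


def true_f(x):
    # Hypothetical "real" signal; the noise added below is what gets memorised.
    return math.sin(2 * math.pi * x)


def make_data(rng, n, noise_sd=0.3):
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x0, k):
    # Average the targets of the k training points closest to x0.
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x0))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, xs, ys, k):
    # Mean squared error of the k-NN model on the points (xs, ys).
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


rng = random.Random(0)
train_x, train_y = make_data(rng, 100)
test_x, test_y = make_data(rng, 1000)

results = {}
for k in (1, 10):
    train_error = mse(train_x, train_y, train_x, train_y, k)
    test_error = mse(train_x, train_y, test_x, test_y, k)
    results[k] = (train_error, test_error)
    print(f"k={k:2d}  train MSE={train_error:.3f}  test MSE={test_error:.3f}")
```

The `k = 1` model should reach zero error on the training data while doing worse than `k = 10` on the new data, which is the over-fitting pattern described above.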

3. High Bias and Low Variance (Under-fitting):

This situation occurs when the model is too simple to capture the pattern in the data, or when we have too little data to build an accurate model.

4. High Bias and High Variance:

This is the worst case for our model: because of the high bias it does not give good predictions on average, and because of the high variance its predictions also change a lot from one data set to another, so it performs poorly on new test data.
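
The four cases can also be summarised with the standard bias-variance decomposition of the expected squared error. The notation is an assumption introduced here and does not appear in the original article: $y = f(x) + \varepsilon$ is the real value with zero-mean noise of variance $\sigma^2$, $\hat{f}(x)$ is the model's prediction, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\left[\left(y - \hat{f}(x)\right)^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```

Read this way, case 1 keeps both of the first two terms small, cases 2 and 3 each let one of them grow, and case 4 lets both grow. The noise term cannot be reduced by any choice of model.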

How to avoid Over-fitting in a Model

Both over-fitting and under-fitting degrade the performance of a machine learning model. Over-fitting is often the main cause, so there are several ways to reduce its occurrence in our model:

  • Cross-Validation
  • Training with more data
  • Removing features
  • Early stopping of the training
  • Regularization
  • Ensembling
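
As an illustration of the first item in the list, cross-validation, here is a minimal sketch. It rests on assumptions that are not in the original article: synthetic noisy data from a made-up function, a hand-written k-nearest-neighbours regressor, and 5 folds. The helper names (`make_data`, `knn_predict`, `k_fold_cv_mse`) are hypothetical. Each fold is held out in turn, the model is fitted on the remaining data, and the error on the held-out points is averaged. The setting with the lowest cross-validated error is then chosen instead of the one that fits the training data best.

```python
import math
import random


def make_data(rng, n, noise_sd=0.3):
    # Synthetic data: a smooth signal plus noise (an assumption for this sketch).
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x0, k):
    # Average the targets of the k training points closest to x0.
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x0))[:k]
    return sum(train_y[i] for i in nearest) / k


def k_fold_cv_mse(xs, ys, k_neighbors, folds=5):
    # Hold out every `folds`-th point in turn and score on the held-out points.
    n = len(xs)
    total = 0.0
    for fold in range(folds):
        val_idx = list(range(fold, n, folds))
        val_set = set(val_idx)
        tr_x = [xs[i] for i in range(n) if i not in val_set]
        tr_y = [ys[i] for i in range(n) if i not in val_set]
        for i in val_idx:
            total += (knn_predict(tr_x, tr_y, xs[i], k_neighbors) - ys[i]) ** 2
    return total / n


rng = random.Random(0)
xs, ys = make_data(rng, 200)

cv_scores = {k: k_fold_cv_mse(xs, ys, k) for k in (1, 3, 5, 9, 15)}
best_k = min(cv_scores, key=cv_scores.get)

for k, score in cv_scores.items():
    print(f"k={k:2d}  cross-validated MSE={score:.3f}")
print("selected k:", best_k)
```

Because every score is computed on points the model did not see during fitting, the most flexible setting (`k = 1`) no longer looks best, even though it would have zero error on its own training data.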

How to avoid Under-fitting:

  • By increasing the training time of the model.
  • By increasing the number of features.
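
Both remedies can be tried on a toy problem. The sketch below is an assumption-laden illustration, not the article's method: the data follow a made-up curve `y = x * x` plus a little noise, and the model is a linear regression trained by plain full-batch gradient descent written by hand. The helper names (`features_linear`, `features_quadratic`, `train`) are hypothetical. It compares a short and a long training run, and a model with the features `[1, x]` against one with the extra feature `x * x`.

```python
import random


def features_linear(x):
    # Bias term plus the raw input: too simple for curved data.
    return [1.0, x]


def features_quadratic(x):
    # Same as above with one extra feature, x squared.
    return [1.0, x, x * x]


def mse(weights, rows, ys):
    total = 0.0
    for row, y in zip(rows, ys):
        pred = sum(w * v for w, v in zip(weights, row))
        total += (pred - y) ** 2
    return total / len(ys)


def train(xs, ys, featurize, epochs, lr=0.2):
    # Full-batch gradient descent on the mean squared error.
    rows = [featurize(x) for x in xs]
    weights = [0.0] * len(rows[0])
    n = len(ys)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for row, y in zip(rows, ys):
            err = sum(w * v for w, v in zip(weights, row)) - y
            for j, v in enumerate(row):
                grads[j] += 2.0 * err * v / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights, mse(weights, rows, ys)


rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [x * x + rng.gauss(0.0, 0.05) for x in xs]

_, mse_linear_short = train(xs, ys, features_linear, epochs=2)
_, mse_linear_long = train(xs, ys, features_linear, epochs=500)
_, mse_quadratic_long = train(xs, ys, features_quadratic, epochs=500)

print(f"[1, x],      2 epochs:   training MSE={mse_linear_short:.4f}")
print(f"[1, x],      500 epochs: training MSE={mse_linear_long:.4f}")
print(f"[1, x, x*x], 500 epochs: training MSE={mse_quadratic_long:.4f}")
```

In this sketch, longer training helps only up to the limit of what the `[1, x]` model can express. The larger drop in training error should come from adding the missing feature.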
