
Bias-Variance Trade-Off & Regularization


What is Bias?

If a machine learning model performs poorly on a set of data because it is too simple to capture the pattern in the data points, the model is said to have high bias and to underfit.

  • The error between the average model prediction and the ground truth
  • The bias of the estimated function tells us the capacity of the underlying model to predict the values

What is Variance?

If a machine learning model successfully accounts for all or almost all points in the training dataset, but then performs poorly on other test datasets, it is said to have high variance and to overfit.

  • The average variability in the model's predictions for the given dataset
  • The variance of the estimated function tells you how much the function can adjust to changes in the dataset

High Bias

  • Overly-Simplified Model
  • Under-Fitting
  • High error on both test and train data

High Variance

  • Overly-complex model
  • Over-Fitting
  • Low error on train data
  • High error on test data
  • Starts modeling the noise in the input

Bias Variance Trade-Off

  • Increasing bias reduces variance and vice-versa
  • Error = Bias² + Variance + irreducible error
  • The best model is the one that minimizes the total error.
  • It is a compromise between bias and variance (see the sketch below).
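To make the trade-off concrete, here is a minimal sketch (synthetic data and scikit-learn models chosen for illustration, not taken from this article) that fits polynomials of increasing degree to a small noisy dataset and compares train and test error: the lowest degree under-fits (high bias) and the highest degree over-fits (high variance).

```python
# A minimal sketch (synthetic data, not from the article) of the bias-variance
# trade-off: polynomials of increasing degree are fit to a small noisy dataset
# and train/test errors are compared.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=30)   # noisy non-linear target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in [1, 3, 15]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    # degree 1: too simple -> high bias, high error on both sets (under-fit)
    # degree 15: too flexible -> high variance, low train error, high test error (over-fit)
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```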

Regularization

The technique used to tackle high variance in regression models is called regularization.

In regression, we try to minimize the error (the cost function). Observe that the cost function depends on the coefficients.

In plain regression, the primary objective is simply to minimize the error; there is no restriction on how small or large the coefficients can be. But in practice, we need to achieve this objective with some restrictions imposed.

  • For example, in linear regression we need to minimize the cost function but with some constraints on the coefficient values. This is because very large coefficient values may be unreliable for both explanation and prediction, as they lead to overfitting.
  • Hence, we add a penalty to the cost function: the sum of the squared coefficient values or the sum of the absolute values of the coefficients. If this sum is large, the cost function value increases, and hence that cannot be the optimal solution.
  • The optimal solution will be one that keeps the sum of the squared (or absolute) coefficient values small while still fitting the data.
  • The equations can be defined as follows.
Ridge Regression: Loss = Σ(y − ŷ)² + λ Σ m²

The above equation is known as ridge regression; if, instead of m², we use the modulus of m (|m|), it is called lasso regression:

Lasso Regression: Loss = Σ(y − ŷ)² + λ Σ |m|

Practically, the factor λ decides the extent of penalization. Observe that if λ=0, then there is no regularization (it’s the same as the original loss function).

Here the base loss function is the mean squared error: MSE = (1/n) Σ(y − ŷ)².
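To see how λ enters the cost, here is a toy numpy sketch (all numbers are made up) that adds the ridge penalty (sum of squared coefficients) or the lasso penalty (sum of absolute coefficients) to the MSE; with λ = 0 the cost reduces to the plain MSE.

```python
# Toy sketch (all numbers made up) of the regularized cost:
#   ridge cost = MSE + lambda * sum(m^2)
#   lasso cost = MSE + lambda * sum(|m|)
import numpy as np

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.8, 5.3, 6.9])
m = np.array([1.5, -0.7, 2.1])              # hypothetical coefficient values

mse = np.mean((y_true - y_pred) ** 2)

for lam in [0.0, 0.1, 1.0, 10.0]:
    ridge_cost = mse + lam * np.sum(m ** 2)
    lasso_cost = mse + lam * np.sum(np.abs(m))
    # lam = 0.0 reproduces the plain MSE; larger lam penalizes large coefficients more
    print(f"lambda={lam:5.1f}  ridge cost={ridge_cost:7.3f}  lasso cost={lasso_cost:7.3f}")
```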

If λ is very high, the penalization on the coefficient values is so strong that they become very small.

In the case of lasso regression, the coefficients of some variables can be driven exactly to 0; hence lasso can also be used as a feature selection method.

In the case of ridge regression, the coefficients can be shrunk close to zero but not exactly to zero.
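This difference can be seen in a small scikit-learn sketch on synthetic data (the dataset and the penalty strength below are illustrative choices, not from the article); note that scikit-learn names the λ factor alpha.

```python
# Sketch (synthetic data, illustrative alpha) comparing ridge and lasso
# coefficients: lasso can drive some coefficients exactly to zero, while
# ridge only shrinks them toward zero.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso

X, y = make_regression(n_samples=200, n_features=10, n_informative=4,
                       noise=10.0, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)   # scikit-learn calls the lambda factor "alpha"
lasso = Lasso(alpha=1.0).fit(X, y)

print("ridge coefficients:", np.round(ridge.coef_, 2))   # small but non-zero
print("lasso coefficients:", np.round(lasso.coef_, 2))   # typically several exact zeros
print("features kept by lasso:", int(np.sum(lasso.coef_ != 0)))
```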

The whole idea of regularization is to reduce overfitting. The observation is that large coefficient values (which generally arise with no regularization) may not generalize the data well and may lead to overfitting.

At the same time, coefficient values that are too low (obtained with high values of λ) may not capture the complete picture, and hence the model may not perform well on the train as well as the test data. This is underfitting.

The λ value needs to be chosen appropriately so that the problem of overfitting/underfitting is lessened.
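One common way to do this, sketched below with scikit-learn's RidgeCV and LassoCV on synthetic data (an illustrative approach, not prescribed by the article), is to select λ by cross-validation over a grid of candidate values.

```python
# Sketch (synthetic data, illustrative grid) of choosing lambda by
# cross-validation with scikit-learn's RidgeCV and LassoCV.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV, LassoCV

X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

alphas = np.logspace(-3, 3, 13)              # candidate lambda values
ridge = RidgeCV(alphas=alphas, cv=5).fit(X, y)
lasso = LassoCV(alphas=alphas, cv=5).fit(X, y)

print("best lambda (alpha) for ridge:", ridge.alpha_)
print("best lambda (alpha) for lasso:", lasso.alpha_)
```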
