Bias-Variance Tradeoff Explained!

Vishwa Pardeshi
4 min read · Jun 25, 2020


Graphical Illustration of Bias-Variance Source

The Bias-Variance tradeoff is a key concept in machine learning that helps with model assessment. The performance of a model on unseen data is the true test of its predictive capability. Thus, the prediction error, which can be mathematically decomposed into bias and variance, captures the model’s generalizability. However, before we jump headfirst into the Bias-Variance tradeoff, let’s revisit a few ML terminologies and concepts.

The performance of a model on unseen data is a true test of its predictive capability.

Back to Basics

The objective of any machine learning algorithm is to build a mathematical model f which trains on finite training data to learn the data’s behavior. The model then takes unseen data as input to either

  • Predict an output : Prediction
  • Infer relationship among variables : Inference

Depending on our objective — prediction or inference, we make conscious tradeoffs between model interpretability and flexibility.

Model Interpretability vs Model Flexibility

A model’s interpretability refers to how easily the model can be viewed not as a mysterious black box but as a clear window into the relationships between the various predictor variables and the response variable in the dataset.

Figure 1: The inflexible model on the left is highly interpretable and helps quantify the relationships between variables. However, not all the data points lie on the flat plane, which reduces accuracy. On the other hand, the right model is more flexible (high complexity) and fits the data points closely but is difficult to interpret. Source

On the other hand, model flexibility, also known as model complexity, captures how flexible or unrestricted the model is (Figure 1; right model). More flexible models naturally fit the data more closely (hint: this can lead to overfitting). For example, a linear model is less flexible than a polynomial model. However, the polynomial model is harder to interpret, as higher degrees and complicated non-linear relationships between the predictor and response variables are difficult to reason about.
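
To make the contrast concrete, here is a minimal sketch (the data and model choices are illustrative assumptions, not from the article) comparing an inflexible linear fit with a much more flexible polynomial fit on the same noisy, non-linear data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 3, 60)).reshape(-1, 1)        # 60 points on [0, 3]
y = np.sin(2 * X).ravel() + rng.normal(0, 0.2, 60)       # non-linear truth + noise

linear = LinearRegression().fit(X, y)                     # inflexible, easy to interpret
poly = make_pipeline(PolynomialFeatures(degree=10),
                     LinearRegression()).fit(X, y)        # flexible, hard to interpret

print("linear  R^2 on training data:", round(linear.score(X, y), 3))  # underfits the curve
print("poly-10 R^2 on training data:", round(poly.score(X, y), 3))    # hugs the points
```

The linear model gives a single slope you can read off and explain; the degree-10 model tracks the points far more closely but its coefficients say little on their own.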

This tradeoff between model interpretability and flexibility forms the basis for further discussion.

Bias-Variance Tradeoff

What is Bias?

Bias refers to the error introduced by the assumptions made when a relatively complex real-life problem is approximated by a simpler model.

It is worth noting that models which are flexible by nature make fewer assumptions about the data, unlike an inflexible model such as linear regression, which assumes that the relationship between the predictor and the response variable is linear. This assumption introduces bias, since the true relationship might not be strictly linear, which negatively impacts the accuracy of estimates made with linear regression.

More flexible models, which make fewer assumptions about the data, have lower bias. Inflexible, high-bias models, on the other hand, are prone to underfitting.

What is Variance?

Variance refers to the amount by which the estimates or results of a machine learning model change when it is trained on a different training dataset. Because machine learning models learn from training data, a change in the training data is bound to result in a change in the model. However, if the model changes drastically for even small changes in the training data, it suffers from high variance.

More flexible models have higher variance, which can lead to overfitting.
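
A minimal sketch of this effect (the setup and model choices are my own assumptions, not the article’s): refit a rigid and a flexible model on many independent training sets and compare how much their prediction at a single fixed point moves around.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)

def sample_training_set(n=50):
    """Draw a fresh training set from the same underlying process."""
    X = rng.uniform(0, 3, n).reshape(-1, 1)
    y = np.sin(2 * X).ravel() + rng.normal(0, 0.2, n)
    return X, y

x0 = np.array([[1.5]])                       # fixed query point
linear_preds, poly_preds = [], []
for _ in range(200):                         # 200 independent training sets
    X, y = sample_training_set()
    linear_preds.append(LinearRegression().fit(X, y).predict(x0)[0])
    poly_preds.append(make_pipeline(PolynomialFeatures(10),
                                    LinearRegression()).fit(X, y).predict(x0)[0])

print("prediction variance, linear   :", round(float(np.var(linear_preds)), 4))  # small but biased
print("prediction variance, degree-10:", round(float(np.var(poly_preds)), 4))    # much larger
```

The linear model’s prediction barely changes from one training set to the next (low variance, high bias), while the degree-10 model’s prediction jumps around (high variance, low bias).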

Why is Bias-Variance Tradeoff important?

Once a final model has been selected, it is important to assess it. Model assessment refers to estimating the model’s prediction error on unseen data. This prediction error is also known as the test error. A good machine learning model minimizes the test error, i.e. improves its generalizability.

This test error is made up of an irreducible and a reducible part. Using mean squared error, the expected test error at a point $x_0$ can be written as

$$E\big[(y_0 - \hat{f}(x_0))^2\big] = \mathrm{Var}\big(\hat{f}(x_0)\big) + \big[\mathrm{Bias}\big(\hat{f}(x_0)\big)\big]^2 + \mathrm{Var}(\epsilon)$$

The above is referred to as the Bias-Variance Decomposition and can be explored here. This decomposition uses the Mean Squared Error as the measure of test error.

While the irreducible error is due to inherent noise in the data and cannot be reduced by choosing a better model, the reducible error is introduced by bias and variance. From the equation, it is clear that we want models which have both low bias and low variance.
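
As a quick sanity check of the decomposition, the sketch below (synthetic data and a plain linear model, both my own assumptions) estimates the bias, variance, and irreducible terms at a single test point by simulation and compares their sum to the empirical expected test MSE.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
sigma = 0.2
f = lambda x: np.sin(2 * x)                  # true (in practice unknown) function
x0 = 1.5                                     # fixed test point

preds, sq_errors = [], []
for _ in range(5000):                        # simulate many train/test draws
    X = rng.uniform(0, 3, 50).reshape(-1, 1)
    y = f(X).ravel() + rng.normal(0, sigma, 50)
    f_hat = LinearRegression().fit(X, y).predict([[x0]])[0]
    y0 = f(x0) + rng.normal(0, sigma)        # fresh test observation at x0
    preds.append(f_hat)
    sq_errors.append((y0 - f_hat) ** 2)

bias_sq = (np.mean(preds) - f(x0)) ** 2      # squared bias of the estimator at x0
variance = np.var(preds)                     # variance of the estimator at x0
print("bias^2 + variance + sigma^2 :", round(float(bias_sq + variance + sigma ** 2), 4))
print("empirical expected test MSE :", round(float(np.mean(sq_errors)), 4))  # roughly equal
```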

Why is there a tradeoff?

Since bias and variance change in opposite directions when the flexibility of the model changes, there exists a tradeoff. Simply put, as bias decreases, variance increases and vice versa.

As the complexity/flexibility of a model increases:

  • Bias initially decreases faster than variance.
  • After a point, variance increases significantly with an increase in flexibility with little impact on bias.

Figure 2: Bias-variance and Total Error Source

The challenge lies in finding a machine learning model with optimum complexity for which both — bias and variance — are low.
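
One way to see this is to sweep flexibility directly. The sketch below (a hypothetical polynomial-degree sweep on synthetic data, not the article’s experiment) tracks training and test MSE as the degree grows; test error typically falls and then rises again, tracing the U-shape in Figure 2.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(3)
X = rng.uniform(0, 3, 200).reshape(-1, 1)
y = np.sin(2 * X).ravel() + rng.normal(0, 0.2, 200)
X_train, X_test, y_train, y_test = X[:100], X[100:], y[:100], y[100:]

for degree in [1, 2, 3, 5, 10, 15]:          # increasing flexibility
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree {degree:2d}   train MSE {train_mse:.3f}   test MSE {test_mse:.3f}")
```

Training error keeps dropping as the degree grows, but the test error bottoms out at an intermediate degree, which is the “optimum complexity” we are after.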

Ways to reduce Bias & Variance

Now that we know how bias and variance are introduced in our models and how they impact the test error, we can use this understanding to minimize the expected test error.

Bias can be reduced by increasing the complexity of the model, for example through the following (see the sketch after this list):

  1. Boosting
  2. Adding more features or introducing polynomial features
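
Below is a minimal sketch of both options (the data, models, and hyperparameters are illustrative assumptions): a plain linear model underfits a non-linear signal, while gradient boosting or added polynomial features reduce its bias.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(4)
X = rng.uniform(0, 3, 200).reshape(-1, 1)
y = np.sin(2 * X).ravel() + rng.normal(0, 0.2, 200)
X_train, X_test, y_train, y_test = X[:100], X[100:], y[:100], y[100:]

models = {
    "plain linear (high bias)": LinearRegression(),
    "gradient boosting":        GradientBoostingRegressor(n_estimators=200),
    "degree-5 polynomial":      make_pipeline(PolynomialFeatures(5), LinearRegression()),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "-> test R^2:", round(model.score(X_test, y_test), 3))
```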

Variance can be reduced by using the following techniques (sketched after the list):

  1. Bagging
  2. Constraining or shrinking estimated coefficients by regularization
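
And a matching sketch for variance reduction (again with illustrative data and hyperparameters of my own choosing): bagging averages many high-variance trees, and ridge regularization shrinks the coefficients of a flexible polynomial model.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(5)
X = rng.uniform(0, 3, 200).reshape(-1, 1)
y = np.sin(2 * X).ravel() + rng.normal(0, 0.2, 200)
X_train, X_test, y_train, y_test = X[:100], X[100:], y[:100], y[100:]

models = {
    "single deep tree (high variance)": DecisionTreeRegressor(),
    "bagged trees":                     BaggingRegressor(DecisionTreeRegressor(),
                                                         n_estimators=100),
    "degree-10 poly, no shrinkage":     make_pipeline(PolynomialFeatures(10),
                                                      LinearRegression()),
    "degree-10 poly + ridge":           make_pipeline(PolynomialFeatures(10),
                                                      Ridge(alpha=1.0)),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "-> test R^2:", round(model.score(X_test, y_test), 3))
```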


Vishwa Pardeshi

Data Science | Machine Learning | Software Development | Feel free to connect @ http://linkedin.com/in/vishwaPardeshi