Understanding Bias and Variance in Machine Learning

Irina (Xinli) Yu, Ph.D.
5 min read · Jul 1, 2024

In machine learning, the concepts of bias and variance are fundamental to understanding the performance of predictive models, and balancing them is key to building models that generalize well to unseen data. In this article, we will dive into the notions of bias and variance, explore the bias-variance trade-off, and discuss strategies to balance them effectively.

What is Bias?

Bias refers to the error introduced by approximating a real-world problem, which may be complex, by a simplified model. In simpler terms, bias is the degree to which a model’s predictions are systematically off from the actual values. High bias can cause the model to miss relevant relations between features and target outputs, leading to underfitting.
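To make "systematically off" precise, bias is often written as the gap between the model's average prediction and the true value. A common formalization (the notation here, with f for the true function and f-hat for the learned predictor, is standard but not taken from this article) is:

```latex
\mathrm{Bias}\big[\hat{f}(x)\big] \;=\; \mathbb{E}\big[\hat{f}(x)\big] \;-\; f(x)
```

where the expectation is taken over different training sets drawn from the same distribution. A high-bias model lands far from f(x) on average, no matter how much training data it sees.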

Characteristics of High Bias:

  • The model is too simple.
  • It fails to capture the underlying trend in the data.
  • Predictions are consistently inaccurate.

Example of High Bias:

Consider a linear regression model trying to fit a non-linear data set. The model, being too simple to capture the non-linearity, will produce biased predictions that consistently miss the actual trend in the data.
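The scenario above can be sketched in a few lines of NumPy. Here we generate hypothetical sine-shaped data (the data-generating function and polynomial degrees are illustrative choices, not from the original) and compare a straight-line fit against a more flexible polynomial. The linear model's training error stays high because no straight line can track the curve:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical non-linear data: y = sin(2*pi*x) plus a little noise
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 200)

# High-bias model: fit a straight line (degree-1 polynomial) by least squares
slope, intercept = np.polyfit(x, y, deg=1)
linear_pred = slope * x + intercept

# More flexible model for comparison: degree-9 polynomial
coeffs = np.polyfit(x, y, deg=9)
poly_pred = np.polyval(coeffs, x)

# Mean squared error on the training data itself
mse_linear = np.mean((y - linear_pred) ** 2)
mse_poly = np.mean((y - poly_pred) ** 2)

print(f"Linear model MSE:   {mse_linear:.3f}")  # large: the line underfits
print(f"Degree-9 poly MSE:  {mse_poly:.3f}")    # much smaller
```

Note that the linear model's error is high even on the data it was trained on, which is the hallmark of underfitting: the problem is the model's rigidity, not a lack of data.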
