Machine Learning 101 — Evaluation Metrics for Regression

Dhruv Kapoor · Published in The Startup
Jul 14, 2020 · 6 min read

One of the most important steps in the Machine Learning process is the evaluation of our model. Choosing the correct metric plays a large role in determining the algorithm we choose and the features we select for our model. Moreover, the metric we choose will help us to interpret the performance of our model and allow us to explain our results to other people. In this blog post, we’ll cover a few of the most popular metrics used for regression.

Photo by Isaac Smith on Unsplash

In my previous blog post, I explained how we can use the Ordinary Least Squares (OLS) method to implement a Linear Regression model. If you haven't read it yet, do check it out below!

Evaluation Metrics for Regression

The Scikit-learn library in Python offers us a plethora of metrics to choose from as seen below:

A list of regression metrics available in the Scikit-learn library
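As a quick way to see what's on offer, scikit-learn (version 1.0 or later, where get_scorer_names is available) lets us list its scorer names programmatically; the filter below is a rough, illustrative way to pick out the regression-related ones:

```python
from sklearn.metrics import get_scorer_names

# Rough filter for regression-related scorers: names containing "error", plus "r2"
regression_scorers = [name for name in get_scorer_names()
                      if "error" in name or name == "r2"]
print(regression_scorers)
```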

In this blog post we’ll focus on some of the most widely used metrics:

  1. Root Mean Squared Error (RMSE) — This is one of the most popular metrics used for regression. For each row, we subtract the predicted value from the actual value and square the difference. We then add all these squared errors together, divide by the total number of data points, and take the square root of the result. If we skip the square root, the metric is referred to as Mean Squared Error (MSE). Mathematically, we get:

RMSE = √( (1/N) · Σᵢ₌₁ᴺ (yᵢ − ŷᵢ)² )

where

  • yᵢ is the actual value
  • ŷᵢ is the predicted value
  • N is the total number of data points

Some points to remember:

  • The RMSE value will always be ≥ 0 as we square the differences
  • The smaller the RMSE value, the better our model, i.e. a good model will have an RMSE value close to 0
  • Because we square the differences between actual and predicted values, larger errors are weighted much more heavily, which makes RMSE vulnerable to outliers
  • It is easy to optimize: squaring makes the loss differentiable, which suits gradient descent algorithms well (a short code sketch follows this list)
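To make this concrete, here is a minimal sketch using scikit-learn's mean_squared_error together with NumPy; the y_true and y_pred arrays are made-up values purely for illustration:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up actual and predicted values, purely for illustration
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

mse = mean_squared_error(y_true, y_pred)  # (1/N) * sum of squared differences
rmse = np.sqrt(mse)                       # take the square root to get RMSE
print(f"MSE = {mse:.3f}, RMSE = {rmse:.3f}")  # MSE = 0.375, RMSE = 0.612
```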

2. Mean Absolute Error (MAE) — In this metric, we subtract the predicted value from the actual value for each row and take the absolute value of this difference so that it is always positive. We then divide the sum of all these absolute differences by the total number of data points. It is calculated as follows:

MAE = (1/N) · Σᵢ₌₁ᴺ |yᵢ − ŷᵢ|

Some points to remember:

  • The MAE value is always ≥ 0 as we take the absolute difference
  • The smaller our MAE value, the better our model, i.e. it should be as close to 0 as possible
  • Errors are weighted linearly, i.e. an error of 2 is exactly twice as bad as an error of 1
  • It is vulnerable to outliers, but less so than RMSE
  • It is harder to optimize than RMSE because the absolute value function is not differentiable at zero (see the sketch after this list)
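Scikit-learn exposes this metric directly via mean_absolute_error; a quick sketch with the same made-up values as before:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Same made-up values as in the RMSE sketch above
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

mae = mean_absolute_error(y_true, y_pred)  # (1/N) * sum of |y - y_hat|
print(f"MAE = {mae:.3f}")  # MAE = 0.500
```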

3. R² (R-squared) — Often referred to as the coefficient of determination, this metric tells us how well our set of input features explains the variance in the dependent variable. It is usually the first metric most of us learn when we perform regression, as it is comparatively simple to interpret. Mathematically, the R² value is calculated as follows:

R² = 1 − SSE / SST

The SSE (Sum of Squared Errors) measures the error of the predictions from our best-fit line. Mathematically, we obtain the following:

SSE = Σᵢ₌₁ᴺ (yᵢ − ŷᵢ)²

If we look closely, we can see that this is the Mean Squared Error (MSE) without the 1/N factor. Similarly, we calculate the SST (Total Sum of Squares) as follows:

SST = Σᵢ₌₁ᴺ (yᵢ − ȳ)²

where

  • ȳ is the mean of the dependent (or target) variable

Some points to remember:

  • The value of R² usually lies between 0 and 1, but a negative value is produced if our model is worse than simply predicting the mean of the dependent variable
  • The higher the R² value, the better our model
  • It makes comparisons between different models easier and more consistent (a short sketch follows this list)
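To tie the formula to code, here is a small sketch that computes R² by hand from the SSE and SST and checks it against scikit-learn's r2_score, again with the same made-up values:

```python
import numpy as np
from sklearn.metrics import r2_score

# Same made-up values as in the earlier sketches
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

sse = np.sum((y_true - y_pred) ** 2)         # Sum of Squared Errors
sst = np.sum((y_true - y_true.mean()) ** 2)  # Total Sum of Squares
print(1 - sse / sst)             # manual R², roughly 0.949
print(r2_score(y_true, y_pred))  # scikit-learn gives the same value
```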

Although the R² value is by far one of the most popular metrics, a major drawback is that it treats every added independent variable as a contribution to the model. Even if we keep adding useless variables, the R² value will never decrease, which can be misleading.

In other words, if our model uses x independent variables to make predictions and we obtain an R² value of 0.8, and we then train another model with x+n independent variables (where n ≥ 1), the R² value of our second model will always be at least 0.8.
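To see this effect in code, here is a small sketch (with randomly generated data, so exact numbers will vary) that adds a column of pure noise to a linear regression and shows that the in-sample R² does not drop:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                               # three informative features
y = X @ [2.0, -1.0, 0.5] + rng.normal(scale=0.5, size=100)  # target with some noise

# Append a column of pure noise: a useless independent variable
X_extra = np.hstack([X, rng.normal(size=(100, 1))])

for name, features in [("3 features", X), ("3 features + noise", X_extra)]:
    model = LinearRegression().fit(features, y)
    print(name, r2_score(y, model.predict(features)))
# The R² with the noise column is never lower; it is usually slightly higher.
```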

To overcome this issue, another metric called Adjusted R² is often utilized. Instead of crediting every independent variable with explaining some variation in the dependent variable, the Adjusted R² value tells us the percentage of variation explained by only those independent variables that actually affect the dependent variable. It is calculated as follows:

Adjusted R² = 1 − (1 − R²) · (N − 1) / (N − p − 1)

where

  • N is the total number of data points
  • p is the number of regressors (independent variables) in our model
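As far as I'm aware, scikit-learn does not ship an Adjusted R² function, but it is easy to compute from r2_score. A minimal sketch, where adjusted_r2 is just an illustrative helper name:

```python
from sklearn.metrics import r2_score

def adjusted_r2(y_true, y_pred, p):
    """Adjusted R²: penalizes R² for the number of regressors p."""
    n = len(y_true)                    # total number of data points (N)
    r2 = r2_score(y_true, y_pred)
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)
```

Note that this only makes sense when N > p + 1; otherwise the denominator becomes zero or negative.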

The Adjusted R² actually penalizes our model for adding useless independent variables, so its value will always be less than or equal to R². In most cases, though, the R² value is sufficient to judge the performance of our model, since we try to add only useful independent variables when creating our regression model.

Conclusion

In this blog post we covered some of the most popular metrics for regression: Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), R² (R-squared) and Adjusted R². We also discussed characteristics of each metric that help us interpret our results.

While it is important to remember that there is no perfect metric, knowing how these metrics work, understanding the differences between them, and weighing the pros and cons of each can help us choose the right metric for our purposes. Knowing these caveats also allows us to distinguish models from one another and helps us create better models.

Thanks for reading and stay tuned for more!

