Evaluation Metrics for Regression Algorithms (Along with their implementation in Python)

Venu Gopal Kadamba

Published in Analytics Vidhya · Nov 27, 2020

This article covers the evaluation metrics used to assess regression algorithms, along with their implementation in Python.

Evaluating a model is an essential part of building an effective machine learning model. Before diving into the metrics, let us understand what a regression algorithm is.

What is a Regression Algorithm?

Regression algorithms fall under supervised machine learning. They predict a continuous value based on input features. For example: predicting a house price from features of the house (number of bedrooms, house size, location, age of the house, year of renovation).

What are Evaluation Metrics?

Evaluation metrics measure the quality of a machine learning model. Different types of algorithms call for different evaluation metrics; here we will discuss the metrics used for regression.

Evaluation Metrics for Machine Learning Regression Algorithms:

  1. Mean Absolute Error
  2. Mean Square Error
  3. Root Mean Square Error
  4. R² Score
  5. Adjusted R² Score

Mean Absolute Error (MAE)

Mean Absolute Error is the average of the absolute differences between the actual values and the predicted values. Because the errors are not squared, MAE is less sensitive to outliers than squared-error metrics. Use MAE when you are solving a regression problem and do not want outliers to play a big role in the result; it can also be useful when you know the distribution of the target is multimodal.

Let us look into the formula for Mean Absolute Error:

MAE = (1/n) · Σᵢ₌₁ⁿ |yᵢ − ŷᵢ|

Let us break down the formula:

yᵢ = actual value

ŷᵢ = predicted value

n = number of samples

Here (yᵢ − ŷᵢ) is the error for the i-th sample, and the absolute value is taken so that positive and negative errors do not cancel out.

Let us look at the implementation of Mean Absolute Error in Python. Let X_train, y_train be the training data and X_test, y_test the test data used to evaluate our model.

A model with a smaller MAE performs better than a model with a larger MAE.

# Importing all necessary libraries
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
# Initializing the model and fitting the model with train data
model = LinearRegression()
model.fit(X_train,y_train)
# Generating predictions over test data
predictions = model.predict(X_test)
# Evaluating the model using MAE Evaluation Metric
print(mean_absolute_error(y_test, predictions))
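As a sanity check, the same MAE value can be computed by hand with NumPy. The arrays below are small made-up values for illustration, not output from a real model:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Small illustrative arrays (hypothetical values, not from a trained model)
actual = np.array([3.0, 5.0, 2.5, 7.0])
predicted = np.array([2.5, 5.0, 4.0, 8.0])

# MAE = (1/n) * sum(|y_i - y_hat_i|)
manual_mae = np.abs(actual - predicted).mean()
print(manual_mae)                              # 0.75
print(mean_absolute_error(actual, predicted))  # 0.75
```

Both lines print the same value, confirming the formula matches scikit-learn's implementation.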

Mean Square Error (MSE)

Mean Square Error is the average of the squared differences between the actual and predicted values.

Because the errors are squared, MSE gives extra weight to large errors: if the model makes a single very bad prediction, MSE magnifies it. This makes MSE useful when large errors are particularly undesirable.

Conversely, MSE is least useful when the dataset contains a lot of noise or outliers, because a single bad prediction can dominate the metric and give a misleading picture of the model's overall quality.

Since the error is squared, MSE is expressed in the squared units of whatever is plotted on the vertical (y) axis.

A large MSE value means the predictions deviate widely from the actual values, and a small MSE value means they are close to them; i.e., a model with a smaller MSE performs better.
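To see how squaring magnifies a single bad prediction, compare MAE and MSE on a toy example with one outlier (the numbers are made up for illustration):

```python
import numpy as np

actual = np.array([2.0, 3.0, 4.0, 5.0])
bad_pred = np.array([2.0, 3.0, 4.0, 15.0])  # one prediction off by 10

mae = np.abs(actual - bad_pred).mean()       # 10 / 4  = 2.5
mse = ((actual - bad_pred) ** 2).mean()      # 100 / 4 = 25.0
print(mae, mse)
```

The single error of 10 contributes 2.5 to MAE but 25.0 to MSE, which is exactly the magnifying effect described above.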

Let us look into the formula for Mean Square Error:

MSE = (1/n) · Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²

Squaring the error (yᵢ − ŷᵢ) removes any negative signs and also gives more weight to large differences.

Let us look at the implementation of Mean Square Error in Python. Let X_train, y_train be the training data and X_test, y_test the test data used to evaluate our model.

# Importing all necessary libraries
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
# Defining our own MSE function
def own_mean_squared_error(actual, predictions):
    return ((predictions - actual) ** 2).mean()
# Initializing the model and fitting the model with train data
# (criterion='mse' was renamed to 'squared_error' in scikit-learn 1.0)
model = RandomForestRegressor(n_estimators=100, criterion='squared_error')
model.fit(X_train, y_train)
# Generating predictions over test data
predictions = model.predict(X_test)
# Evaluating the model using MSE Evaluation Metric
print(mean_squared_error(y_test, predictions))
print(own_mean_squared_error(y_test, predictions))

Root Mean Square Error (RMSE)

Root Mean Square Error is simply the square root of the MSE. Like MSE, RMSE is sensitive to outliers, since it assigns higher weight to large errors; it is most useful when large errors are present and drastically affect model performance.

RMSE is one of the most frequently used evaluation metrics for regression. Unlike MSE, Root Mean Square Error has the same units as the quantity plotted on the vertical (y) axis, because the square root of the MSE is taken.

Let us look into the formula for Root Mean Square Error:

RMSE = √( (1/n) · Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² )

Older versions of scikit-learn do not provide a dedicated RMSE function, so let us implement Root Mean Square Error by defining our own function. Let X_train, y_train be the training data and X_test, y_test the test data used to evaluate our model.

# Importing all necessary libraries
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
# Defining RMSE function
def root_mean_squared_error(actual, predictions):
    return np.sqrt(mean_squared_error(actual, predictions))
# Initializing the model and fitting the model with train data
# (criterion='mse' was renamed to 'squared_error' in scikit-learn 1.0)
model = RandomForestRegressor(n_estimators=100, criterion='squared_error')
model.fit(X_train, y_train)
# Generating predictions over test data
predictions = model.predict(X_test)
# Evaluating the model using RMSE Evaluation Metric
print(root_mean_squared_error(y_test, predictions))

Note: For scikit-learn ≥ 0.22.0, sklearn.metrics.mean_squared_error accepts a squared keyword argument (default True); setting squared=False returns the RMSE. In scikit-learn ≥ 1.4 there is also a dedicated root_mean_squared_error function, and the squared argument is deprecated.

# For sklearn versions >= 0.22.0
print(mean_squared_error(y_test, predictions, squared=False))
# For sklearn versions >= 1.4
# from sklearn.metrics import root_mean_squared_error
# print(root_mean_squared_error(y_test, predictions))

R² Score

The R² score, also known as the coefficient of determination, measures how well a model fits a given dataset, i.e., how close the predicted values are to the actual values.

Let us look into the formula for better understanding:

R² = 1 − (SSᵣₑₛ / SSₜₒₜ)

Let us break down the formula and look into each term:

SSᵣₑₛ = sum of squares of residuals = Σᵢ (yᵢ − ŷᵢ)²

SSₜₒₜ = total sum of squares = Σᵢ (yᵢ − ȳ)², where ȳ is the mean of the actual values

The R² value ranges from −∞ to 1. A value of 1 means a perfect fit, a value of 0 means the model performs no better than always predicting the mean, and a negative value means the model performs worse than simply predicting the mean of the target.

Let us look into the implementation for R² evaluation metric:

# Importing all necessary libraries
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
# Initializing the model and fitting the model with train data
model = LinearRegression()
model.fit(X_train,y_train)
# Generating predictions over test data
predictions = model.predict(X_test)
# Evaluating the model using R² Evaluation Metric
print(r2_score(y_test, predictions))
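A quick numeric illustration of the range of R², using small made-up arrays: predictions that track the actual values give an R² close to 1, while predictions that are worse than guessing the mean give a negative R².

```python
from sklearn.metrics import r2_score

actual = [1.0, 2.0, 3.0, 4.0]
good = [1.1, 1.9, 3.2, 3.9]  # close to actual -> R² near 1
bad = [4.0, 3.0, 2.0, 1.0]   # anti-correlated -> worse than the mean

print(r2_score(actual, good))  # 0.986
print(r2_score(actual, bad))   # -3.0
```

Here SSₜₒₜ = 5.0; the good predictions give SSᵣₑₛ = 0.07 (R² = 0.986), while the bad predictions give SSᵣₑₛ = 20.0, so R² = 1 − 20/5 = −3.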

The major drawback of the R² metric is that as the number of input features increases, the R² value (on the training data) never decreases, irrespective of the significance of the added feature with respect to the output variable; i.e., even if the added feature has no correlation with the output variable, R² can still increase.
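This drawback is easy to demonstrate on synthetic data. In the sketch below (all data is randomly generated, not from the article's examples), a feature of pure noise is appended, and the training-set R² still does not decrease:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
# Target depends only on the three original features, plus noise
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.5, size=100)

# R² on the training data with the original features
r2_before = r2_score(y, LinearRegression().fit(X, y).predict(X))

# Append a purely random feature, unrelated to y
X_more = np.hstack([X, rng.normal(size=(100, 1))])
r2_after = r2_score(y, LinearRegression().fit(X_more, y).predict(X_more))

print(r2_before, r2_after)  # r2_after is never lower than r2_before
```

For ordinary least squares, adding any column can only lower (or leave unchanged) the residual sum of squares on the training data, so training R² never decreases.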

Adjusted R² Score

Adjusted R² is a modified form of R² that penalizes the addition of new predictors (independent variables); it increases only if a new predictor actually enhances the model's performance.

Let us look into the formula for Adjusted R²:

Adjusted R² = 1 − (1 − R²) · (n − 1) / (n − k − 1)

Let us break down the formula and look into each term:

R² : the R² score of the model

n : number of samples in the dataset

k : number of predictors (independent variables)

There is no inbuilt scikit-learn function that computes adjusted R² directly. Let us look at the implementation of Adjusted R²:

# Importing all necessary libraries
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
# Defining the adjusted R² function
def adjusted_r2_score(actual, predictions, num_pred, num_samples):
    n = num_samples
    k = num_pred
    r2 = r2_score(actual, predictions)
    adjusted_r2 = 1 - ((1 - r2) * ((n - 1) / (n - k - 1)))
    return adjusted_r2
# Initializing the model and fitting the model with train data
model = LinearRegression()
model.fit(X_train, y_train)
# Generating predictions over test data
predictions = model.predict(X_test)
# Evaluating the model using Adjusted R² Evaluation Metric
num_samples = X_test.shape[0]
num_predictors = X_test.shape[1]
print(adjusted_r2_score(y_test, predictions, num_predictors, num_samples))

Note: Adjusted R² will always be less than or equal to the R² score (whenever n > k + 1).
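This follows from the formula: since (n − 1)/(n − k − 1) ≥ 1 whenever n > k + 1, the (1 − R²) term is scaled up, so the adjusted score can only go down. A quick check with hypothetical numbers (R² = 0.90, n = 50):

```python
def adjusted_r2(r2, n, k):
    # Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - k - 1)
    return 1 - (1 - r2) * ((n - 1) / (n - k - 1))

r2 = 0.90
for k in (1, 3, 5):
    adj = adjusted_r2(r2, n=50, k=k)
    print(k, adj)  # always <= 0.90, and smaller as k grows
```

With more predictors (larger k) and the same R², the penalty grows and the adjusted score shrinks.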

The five metrics above are the most commonly used evaluation metrics for regression algorithms.

If you liked this article please follow me. If you noticed any mistakes in the formulas, code or in the content, please let me know.

You can find me at : LinkedIn, GitHub

Thank You!
