Error Metrics used in Time Series Forecasting Modeling

Sai Varun Immidi
Published in Analytics Vidhya · Aug 8, 2020 · 5 min read
Analyzing the time series forecasting model performance

Error metrics are very useful in assessing model performance. Broadly, they serve two purposes: first, they tell us how close our predictions are to the actual values; second, they let us compare different models and pick the best among them. Naturally, the lower the error, the more accurate the model's predictions of future data points, i.e., the forecasts.

Let’s look at some of the common error metrics used in Time Series Forecasting model assessment:

· Mean Forecast Error

· Mean Absolute Error

· Mean Absolute Percentage Error

· Mean Square Error

· Root Mean Square Error

Let’s go through each of the error metrics in detail.

1) Mean Forecast Error (MFE): Also called forecast bias. This is generally considered a naive metric: we simply take the differences between actual and predicted values and then average them. In doing so, the positive errors (under-predictions) and negative errors (over-predictions) cancel each other out, which is why the metric is considered naive. MFE tells us whether, on the whole, our predictions are underestimated or overestimated. A positive MFE indicates that our predictions are lower than the actual values on the whole; likewise, a negative MFE indicates that our predictions are higher than the actual values. However, MFE tells us nothing about the magnitude of the deviation between actual and predicted values. This drives us to a better metric, MAE.
Mean Forecast Error: MFE = (1/n) Σᵢ (yᵢ − ŷᵢ)

Where:
* yᵢ (y actual) is the actual time-stamped value of the target variable.
* ŷᵢ (y hat forecast) is the prediction made by the time series forecasting model for the same time stamp.
* n is the number of forecast points.
Note: the same notation applies to all the formulas below.
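As a minimal NumPy sketch of the idea (the function name and sample data are my own, for illustration):

```python
import numpy as np

def mean_forecast_error(y_actual, y_forecast):
    """MFE = mean(actual - forecast); the sign reveals systematic bias."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_forecast = np.asarray(y_forecast, dtype=float)
    return float(np.mean(y_actual - y_forecast))

actual   = [100, 110, 120, 130]
forecast = [ 90, 120, 110, 140]   # errors: +10, -10, +10, -10
print(mean_forecast_error(actual, forecast))  # 0.0 — the errors cancel out
```

Note how a forecast that is off by 10 at every point still scores a perfect MFE of 0, which is exactly the cancellation problem described above.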

2) Mean Absolute Error (MAE): This metric overcomes the shortcoming of MFE. Rather than averaging the raw differences between actual and predicted values, we take the absolute value of each error term and then average those. This way, positive and negative errors cannot cancel each other out. MAE tells us, on average, how far the predictions deviate from the actual values. At the same time, there is a small nuance in interpreting the metric: to draw any conclusion from MAE we need to take the scale of the actual values into account. Suppose the actual values are in single digits and MAE turns out to be 1.5; such a deviation in the predictions would be unacceptable. Alternatively, if the actual values are in the thousands and MAE again turns out to be 1.5, the predictions are considered good, as the deviation is very small relative to the data. Hence the scale of the actual values must be considered during evaluation. To overcome this shortcoming, we express the errors as percentages of the actual values, which drives us to another metric, MAPE.

Mean Absolute Error: MAE = (1/n) Σᵢ |yᵢ − ŷᵢ|
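A matching sketch in NumPy (again, the function name and data are illustrative, not from the article):

```python
import numpy as np

def mean_absolute_error(y_actual, y_forecast):
    """MAE = mean(|actual - forecast|): average deviation in the data's own units."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_forecast = np.asarray(y_forecast, dtype=float)
    return float(np.mean(np.abs(y_actual - y_forecast)))

actual   = [100, 110, 120, 130]
forecast = [ 90, 120, 110, 140]   # errors: +10, -10, +10, -10
print(mean_absolute_error(actual, forecast))  # 10.0 — the deviation no longer cancels
```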

3) Mean Absolute Percentage Error (MAPE): This metric makes the evaluation easier to interpret. It is computed by expressing each absolute error term as a percentage of the corresponding actual value. This way, we no longer need to consider the scale of the actual values. Since MAPE is a percentage, subtracting it from 100 gives a rough measure of the accuracy of our predictions. It is essentially a scale-free version of MAE. Note that MAPE is undefined whenever an actual value is zero, since that would mean dividing by zero.

Mean Absolute Percentage Error: MAPE = (100/n) Σᵢ |(yᵢ − ŷᵢ) / yᵢ|
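A short NumPy sketch (function name and sample data are my own; note the assumption that no actual value is zero):

```python
import numpy as np

def mape(y_actual, y_forecast):
    """MAPE = mean(|actual - forecast| / |actual|) * 100; undefined if any actual is 0."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_forecast = np.asarray(y_forecast, dtype=float)
    return float(np.mean(np.abs((y_actual - y_forecast) / y_actual)) * 100)

actual   = [200, 400, 800]
forecast = [180, 440, 760]   # percentage errors: 10%, 10%, 5%
print(mape(actual, forecast))        # ≈ 8.33 (%)
print(100 - mape(actual, forecast))  # ≈ 91.67 — rough "accuracy" of the forecasts
```

Because each error is divided by its own actual value, the same MAPE means the same relative quality whether the series is in single digits or in the thousands.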

4) Mean Square Error (MSE): MSE and MAE are based on the same idea: avoiding the cancellation of positive and negative error terms when they are summed. Here, we square each error term and then take the mean of the squares. MSE is also commonly used as a loss function in the regression setting, which is then optimized to obtain the optimum model parameters. MSE is often a better choice of loss function than MAE because the absolute value function is not differentiable at zero error, so optimizing MAE with gradient-based methods can fail to yield optimum model parameters. There are pros and cons to each, but that boils down to the separate topic of loss functions, so let's not deviate further. There is a small nuance in interpreting MSE as-is: since we square the error terms, the units of MSE are the square of the units of the actual values. To evaluate on the same scale as the data, we move on to the final metric of this post. In a nutshell, MSE also tells us, on average, how far the predictions deviate from the actual values, while penalizing large errors more heavily.

Mean Square Error: MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²
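A NumPy sketch of MSE (function name and data are illustrative); the example also shows how squaring makes the largest error dominate:

```python
import numpy as np

def mean_squared_error(y_actual, y_forecast):
    """MSE = mean((actual - forecast)^2); units are the square of the data's units."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_forecast = np.asarray(y_forecast, dtype=float)
    return float(np.mean((y_actual - y_forecast) ** 2))

actual   = [100, 110, 120]
forecast = [ 98, 110, 116]   # errors: 2, 0, 4 -> squares: 4, 0, 16
print(mean_squared_error(actual, forecast))  # ≈ 6.67 — the error of 4 contributes most
```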

5) Root Mean Square Error (RMSE): To bring the metric back to the same scale as the actual values, we take the square root of MSE.

Root Mean Square Error: RMSE = √MSE = √((1/n) Σᵢ (yᵢ − ŷᵢ)²)
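And the final sketch (function name and sample data are my own), showing RMSE back in the original units of the series:

```python
import numpy as np

def rmse(y_actual, y_forecast):
    """RMSE = sqrt(MSE): same scale as the actual values."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_forecast = np.asarray(y_forecast, dtype=float)
    return float(np.sqrt(np.mean((y_actual - y_forecast) ** 2)))

actual   = [100, 110, 120, 130]
forecast = [ 97, 114, 120, 130]   # errors: 3, -4, 0, 0 -> squares: 9, 16, 0, 0
print(rmse(actual, forecast))  # 2.5, i.e. sqrt(25/4), in the same units as the data
```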

These are some of the commonly used error metrics in Time Series Forecasting modeling. Some of them are also used in other machine learning settings, such as regression.

Finally, we have come to the end of this post!

References:

* upGrad learning platform.
* Formula image source: upGrad Learning platform: Time Series Forecasting - 1
* Header image source: relexsolutions.com
