This article is meant to give a practical demonstration of Machine Learning with a small data set. For a basic explanation of MAE, do check my other article, Mean Absolute Error ~ MAE in Machine Learning (ML).
In this article, I will give a working example of how to calculate the Mean Absolute Error using a model that predicts the cost of houses of different sizes.
The calculation in this article is done by hand, using only basic maths/statistics. Programming libraries (especially for Python: NumPy, scikit-learn, etc.) also implement this metric and can be imported for the same purpose.
A prediction error is calculated for each record of the test data set. Next, we convert each error to a positive figure if it is negative, by taking its absolute value. Finally, we calculate the mean of all the absolute errors (the sum of all absolute errors divided by the number of records).
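The three steps above can be sketched in a few lines of plain Python. This is only a minimal illustration: the function name `mean_absolute_error` and the toy numbers are my own assumptions, not values from the housing example that follows.

```python
def mean_absolute_error(actual, predicted):
    """Return the mean of |actual - predicted| over paired records."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one record")
    # Step 1: prediction error for each record (actual - predicted).
    errors = [a - p for a, p in zip(actual, predicted)]
    # Step 2: make every error positive by taking its absolute value.
    absolute_errors = [abs(e) for e in errors]
    # Step 3: average the absolute errors.
    return sum(absolute_errors) / len(absolute_errors)


# Toy values chosen only for illustration (not from the housing example).
toy_actual = [3.0, 5.0]
toy_predicted = [2.0, 7.0]
toy_mae = mean_absolute_error(toy_actual, toy_predicted)
print(toy_mae)  # errors are 1.0 and -2.0, so the MAE is (1.0 + 2.0) / 2 = 1.5
```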
Actual Costs - assumed actual cost of houses in this example
2 bedroom — $200K
3 bedroom — $300K
4 bedroom — $400K
5 bedroom — $500K
Predicted Costs - assumed predicted cost of houses in this example
2 bedroom — $230K
3 bedroom — $290K
4 bedroom — $740K
5 bedroom — $450K
Mean Absolute ERROR
What exactly does ‘ERROR’ in this metric mean?
Prediction Error => Actual Value - Predicted Value
In this case, the error for each prediction can be calculated as below:
2 bedroom house
Actual Price = $200K
Predicted Price = $230K
Error => Actual Price - Predicted Price = $200K - $230K = -$30K
Absolute Error 1 = |Error| = $30K (Absolute or positive value of our error)
3 bedroom house
Actual Price = $300K
Predicted Price = $290K
Error => Actual Price - Predicted Price = $300K - $290K = $10K
Absolute Error 2 = |Error| = $10K (Absolute or positive value of our error)
4 bedroom house
Actual Price = $400K
Predicted Price = $740K
Error => Actual Price - Predicted Price = $400K - $740K = -$340K
Absolute Error 3 = |Error| = $340K (Absolute or positive value of our error)
5 bedroom house
Actual Price = $500K
Predicted Price = $450K
Error => Actual Price - Predicted Price = $500K - $450K = $50K
Absolute Error 4 = |Error| = $50K (Absolute or positive value of our error)
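The four per-house calculations above can be reproduced with a short loop. This is a sketch under my own assumptions: prices are stored in thousands of dollars, and the variable names (`actual_prices`, `predicted_prices`, `rows`) are made up for illustration.

```python
# Prices in thousands of dollars, keyed by number of bedrooms
# (values taken from the two lists in this article).
actual_prices = {2: 200, 3: 300, 4: 400, 5: 500}
predicted_prices = {2: 230, 3: 290, 4: 740, 5: 450}

rows = []
for bedrooms, actual in actual_prices.items():
    predicted = predicted_prices[bedrooms]
    error = actual - predicted     # can be negative
    absolute_error = abs(error)    # always zero or positive
    rows.append((bedrooms, error, absolute_error))
    print(f"{bedrooms} bedroom: error = {error}K, absolute error = {absolute_error}K")
```

Running this should give errors of -30K, 10K, -340K and 50K, and absolute errors of 30K, 10K, 340K and 50K.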
Let n be the total number of records in the test data set
n = 4
MAE = (Absolute Error 1 + Absolute Error 2 + Absolute Error 3 + Absolute Error 4) / n
MAE = ($30K + $10K + $340K + $50K)/4
MAE = $107.5K
This is our measure of model quality. We can therefore say that, on average, our model's predictions are off by approximately $107.5K.
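As mentioned at the start, libraries can do the same calculation. Here is a minimal cross-check with NumPy, assuming it is installed and that prices are expressed in thousands of dollars. The array names are my own.

```python
import numpy as np

# Prices in thousands of dollars, ordered 2, 3, 4 and 5 bedrooms.
actual = np.array([200.0, 300.0, 400.0, 500.0])
predicted = np.array([230.0, 290.0, 740.0, 450.0])

# Element-wise absolute errors, then their mean.
mae = float(np.mean(np.abs(actual - predicted)))
print(mae)  # 107.5, matching the hand calculation
```

If you prefer scikit-learn, its `mean_absolute_error` function in `sklearn.metrics` computes the same quantity.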
In upcoming articles, I will explain other metrics for verifying the accuracy of our model, such as Root Mean Squared Error (RMSE). I will also compare their advantages, disadvantages and similarities, and show working examples.