Metrics to evaluate machine learning algorithms: Part 1

Image source: Evaluation Metrics in Machine Learning

Many people build models using machine learning algorithms, but a lot of them don't know which metric to use to measure the performance of the model for a given algorithm.

In this post, we will see how to address that problem.

The performance of a model can be measured using different types of evaluation metrics, such as:

  • Classification Accuracy
  • Confusion Matrix
  • Logarithmic Loss (or log loss)

Before learning about the metrics, we should understand balanced and imbalanced data.

What is balanced data?

Suppose you have a dataset with 1,000 points:

negative (-ve) data points = 520

positive (+ve) data points = 480

The number of negative points is approximately (though not exactly) equal to the number of positive points. This type of data is balanced data.

What is imbalanced data?

Suppose you have a dataset with 1,000 points:

negative (-ve) data points = 900

positive (+ve) data points = 100

negative (-ve) data points >> positive (+ve) data points

(or)

negative (-ve) data points = 100

positive (+ve) data points = 900

negative (-ve) data points << positive (+ve) data points

When one class heavily outnumbers the other like this, the data is imbalanced data.
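As a quick way to check this in practice, here is a minimal Python sketch (the label lists are hypothetical, chosen to match the counts above) that computes the class distribution of a dataset:

```python
from collections import Counter

# Hypothetical label lists for illustration (0 = negative, 1 = positive)
balanced_labels = [0] * 520 + [1] * 480
imbalanced_labels = [0] * 900 + [1] * 100

def class_ratio(labels):
    """Return the fraction of the dataset belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: count / total for cls, count in counts.items()}

print(class_ratio(balanced_labels))    # {0: 0.52, 1: 0.48} -> roughly balanced
print(class_ratio(imbalanced_labels))  # {0: 0.9, 1: 0.1}   -> imbalanced
```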

CLASSIFICATION ACCURACY

Accuracy is the number of correctly classified points in the test data divided by the total number of points.

Classification accuracy = (number of correctly classified points) / (total number of points in test data)

  • Accuracy lies between 0 and 1.
  • If the accuracy is 0, the performance of the model is very bad; if the accuracy is 1, the performance of the model is very good.
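As a quick illustration, here is a minimal sketch of this formula in Python (the labels are made up for the example):

```python
# Classification accuracy on hypothetical test labels
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # actual class labels
y_pred = [1, 0, 0, 1, 0, 1, 0, 1, 1, 0]  # predicted class labels

correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
accuracy = correct / len(y_true)
print(accuracy)  # 0.8 -> 8 of the 10 points are classified correctly
```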

Accuracy fails in two cases:

  1. Imbalanced data.
  2. If the model returns the probability of the label.

Before seeing how accuracy fails on imbalanced data, we should know what a "dumb model" is.

What is a dumb model?

Suppose I have a dataset:

Dn = 1000 (total dataset),

Dtrain = 900 points, only negative,

Dtest = 100 points, both negative and positive.

Now train the model on the train data: it learns only negative points and returns negative for any query point. This type of model is called a dumb model.

How accuracy fails in imbalanced data

Imagine I have a dumb model (M1) that returns negative for any query point.

Suppose the test data Dtest has 90 negative (-ve) points and 10 positive (+ve) points, so it is imbalanced. Now run the dumb model M1 on Dtest: the accuracy is 90%, because Dtest itself contains 90% negative points. Because of the imbalance, even the dumb model M1 gets high accuracy. This means accuracy is not a very useful metric for measuring the performance of a model when the data is imbalanced.
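To see this concretely, here is a small sketch that reproduces the scenario above:

```python
# The dumb model M1 on imbalanced test data (0 = negative, 1 = positive)
y_test = [0] * 90 + [1] * 10   # Dtest: 90 negative points, 10 positive points
y_pred = [0] * len(y_test)     # M1 returns negative for every query point

accuracy = sum(t == p for t, p in zip(y_test, y_pred)) / len(y_test)
print(accuracy)  # 0.9 -> 90% accuracy even though the model learned nothing
```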

How accuracy fails if the model returns the probability of the label

Imagine you have two models, but you don't know which one is better.

Image by Author

Observe the figure: Y1' is the predicted class label of Model 1 and Y2' is the predicted class label of Model 2. Accuracy is measured using class labels only, so after calculating the accuracy we conclude that both models are equally good. But looking at the probabilities of the class labels, we conclude that Model 1 is better than Model 2. The two conclusions differ, so when a model returns the probability of the class label, accuracy is not a good way to measure its performance.
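Here is a hypothetical sketch of that situation: the two models produce identical class labels (and therefore identical accuracy), but Model 1's probabilities are much closer to the truth. The numbers are made up for illustration, with a label of 1 assigned whenever the probability is at least 0.5:

```python
y_true = [1, 1, 0, 0]

p_model1 = [0.95, 0.90, 0.10, 0.05]  # confident and correct
p_model2 = [0.55, 0.60, 0.45, 0.40]  # barely on the right side of 0.5

def to_labels(probs):
    """Threshold probabilities at 0.5 to get class labels."""
    return [1 if p >= 0.5 else 0 for p in probs]

print(to_labels(p_model1) == to_labels(p_model2))  # True -> same labels
# Both models score 100% accuracy here, yet Model 1 is clearly better.
```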

CONFUSION MATRIX

As its name suggests, the confusion matrix gives us a matrix as output and describes the complete performance of the model.

Image by Author

Assume the problem is binary classification: we have samples that fall into one of two classes, YES or NO, and a classifier we developed that predicts a class for each input sample. Running our model on 165 samples gives the following result.

Image by Author

There are four key terms:

  • TP (True Positive): cases where we predicted YES and the actual output was YES.
  • TN (True Negative): cases where we predicted NO and the actual output was NO.
  • FP (False Positive): cases where we predicted YES but the actual output was NO.
  • FN (False Negative): cases where we predicted NO but the actual output was YES.

Here T = True, F = False, P = Positive, and N = Negative; R stands for Rate. In the formulas below, P is the total number of actual positive points (TP + FN) and N is the total number of actual negative points (TN + FP).

TPR=TP/P

TNR=TN/N

FPR=FP/N

FNR=FN/P

By looking at these four rates, we can tell how the model is performing on both balanced and imbalanced data.

  • If TPR and TNR are large and FPR and FNR are small, then the model is good.
  • If TPR and TNR are small and FPR and FNR are large, then the model is bad.

The accuracy can be computed from the confusion matrix by summing the counts along the "major diagonal" (TP and TN) and dividing by the total number of points, i.e.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

NOTE: The confusion matrix can be used to measure the performance of a model on both balanced and imbalanced data.
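A minimal sketch, assuming scikit-learn is available and using hypothetical labels, that computes the confusion matrix and the four rates:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels for illustration (0 = NO, 1 = YES)
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# For binary labels, ravel() unpacks the matrix as TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

p, n = tp + fn, tn + fp  # total actual positives and negatives
print("TPR =", tp / p)
print("TNR =", tn / n)
print("FPR =", fp / n)
print("FNR =", fn / p)
print("Accuracy =", (tp + tn) / (p + n))
```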

LOGARITHMIC LOSS

The logarithmic loss (log loss) metric is used when the model predicts the probability of the class label.

Image by Author

Suppose you have a binary classification model that predicts the probability of the class label. We can find the logarithmic loss using the formula below:

Log Loss = -(1/N) * Σ [y_i * log(p_i) + (1 - y_i) * log(1 - p_i)]

Here N is the number of points, y_i is the actual label of point i (0 or 1), and p_i is the predicted probability that point i belongs to class 1.

If the log loss is small, the performance of the model is better; if the log loss is very high, the performance of the model is very bad.

It works well for multi-class classification too. The formula for multi-class classification is:

Log Loss = -(1/N) * Σ_i Σ_j y_ij * log(p_ij)

Here the inner sum runs over the M classes, y_ij is 1 if point i belongs to class j (and 0 otherwise), and p_ij is the predicted probability that point i belongs to class j.
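As a sketch (assuming NumPy is available), here is the binary formula implemented directly; scikit-learn's log_loss function gives the same result and also handles the multi-class case:

```python
import numpy as np

def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Binary log loss; probabilities are clipped to avoid log(0)."""
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Hypothetical predictions: the first model is confident, the second barely right
y_true = [1, 1, 0, 0]
print(binary_log_loss(y_true, [0.95, 0.90, 0.10, 0.05]))  # ~0.08 -> good model
print(binary_log_loss(y_true, [0.55, 0.60, 0.45, 0.40]))  # ~0.55 -> weaker model
```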

SUMMARY

  • If the model returns class labels and the data is balanced, we go for accuracy.
  • If the model returns class labels and the data is balanced or imbalanced, we can go for the confusion matrix.
  • If the model returns the probability of the class label, we go for log loss.

If you have reached this far, thank you for reading this blog, I hope you found it insightful 😃. Give us a follow for more content on technology, productivity, work habits, and more!
