Evaluation Metrics explained the way they should be

Lakshyasoni
5 min read · Feb 22, 2023


This article covers evaluation metrics for classification models only.

[Image: confusion matrix (source: https://en.wikipedia.org/wiki/Confusion_matrix)]

I know whenever you look at this image, you always get confused. Otherwise, why would you be here?

So, in this article, I will try to pass on the intuition that I developed for all these metrics. The following topics will be covered in this article:

TP, TN, FP, FN, TPR, FPR, TNR, FNR, Precision, Recall, Sensitivity, and Specificity.

btw

  • TP — True Positive
  • TN — True Negative
  • FP — False Positive
  • FN — False Negative

and we will get to know the others as we move along further in this article.

So, first things first: anything written in terms of TP, TN, FP, and FN has a basic structure. The second letter states the label you predicted: if it is P, you predicted Positive, and if it is N, you predicted Negative for that particular record. The first letter indicates whether that prediction is correct or not. For example, TP means the Positive label you predicted was correct, i.e. True; FN means your model predicted Negative but that was incorrect, i.e. False, and in reality the label is Positive.

[Image: confusion matrix with the TP, TN, FP, and FN cells (source: https://en.wikipedia.org/wiki/Confusion_matrix)]

Keep in mind that we are doing all of this to look at the quality of the model’s output, and to do this we need to have the actual labels along with the predicted labels.
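To make this concrete, here is a minimal Python sketch (assuming binary labels encoded as 1 = Positive and 0 = Negative, with made-up example lists) of how the four counts come out of comparing actual and predicted labels:

```python
# Hypothetical example lists: 1 = Positive, 0 = Negative
y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual labels
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]  # model's predictions

# Count each outcome by comparing actual vs. predicted per record
tp = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 1)  # predicted Positive, and it was correct
tn = sum(1 for a, p in zip(y_true, y_pred) if a == 0 and p == 0)  # predicted Negative, and it was correct
fp = sum(1 for a, p in zip(y_true, y_pred) if a == 0 and p == 1)  # predicted Positive, but it was wrong
fn = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 0)  # predicted Negative, but it was wrong

print(tp, tn, fp, fn)  # 3 3 1 1 for these example lists
```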

So, I hope that the above things make much more sense now.

Now, let’s look at the other terms, i.e. TPR, FPR, etc.

  • TPR (True Positive Rate) — TPR tells you, out of all the actual positive labels (True Positive and False Negative), how many your model predicted right (True Positive). You can also see this in the formula:
TPR = TP / (TP + FN) (source: https://en.wikipedia.org/wiki/Confusion_matrix)
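As a tiny illustration, reusing the hypothetical counts from the sketch above, TPR is just that ratio:

```python
tp, fn = 3, 1  # hypothetical counts from the earlier sketch

tpr = tp / (tp + fn)  # denominator = all actual Positives
print(tpr)  # 0.75
```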

Recall and Sensitivity are just other names for TPR. To remember them, try to understand the literal meaning of each term (this holds for the intuition behind any of these terms, but we tend to forget this simple thing).

Recall — Imagine that you are sitting for a test where every question has only two options (True and False). The score you get on this test is like Recall, i.e. of all the correct answers, how many were you able to remember?

Sensitivity — Similarly, Sensitivity refers to how sensitive your model is to the positive labels.

Note: Whenever we talk about these terms, the general convention is to talk about the positive labels. For example, when we talked about Recall, we were concerned with the recall of the positive labels; Sensitivity was also defined in terms of the positive labels only.
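If you happen to use scikit-learn, its recall_score computes exactly this TPR of the positive class; here is a quick sketch, with the same hypothetical labels as before, to cross-check:

```python
from sklearn.metrics import recall_score  # assumes scikit-learn is installed

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical actual labels
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]  # hypothetical predictions

# Recall / Sensitivity / TPR of the positive class = TP / (TP + FN)
print(recall_score(y_true, y_pred))  # 0.75
```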

  • FPR (False Positive Rate) — FPR means that of all the actual negative labels (True Negative and False Positive), how many did the model get wrong (False Positive).

A simple trick to remember: whenever we talk about these ‘rates’, we calculate the ‘rate’ with respect to the actual label. For example, for FPR the actual label is Negative (FP means your model predicted Positive, but that was incorrect and in reality the label was Negative). So, for FPR we calculate the rate with respect to the actual negative labels. From this you can easily derive the formula: keep the count of all the actual ‘negative’ or ‘positive’ labels in the denominator, and the quantity you are calculating the rate for in the numerator. So for FPR it is the ratio of FP to all the actual Negatives.

FPR = FP / (FP + TN) (source: https://en.wikipedia.org/wiki/Confusion_matrix)
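Here is the same trick in a short sketch, again with the hypothetical counts from earlier: the denominator is all the actual Negatives.

```python
fp, tn = 1, 3  # hypothetical counts from the earlier sketch

fpr = fp / (fp + tn)  # denominator = all actual Negatives
print(fpr)  # 0.25
```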
  • TNR (True Negative Rate) — TNR implies, out of all the actual negative labels (True Negative and False Positive), how many the model got right (True Negative). TNR is also known as Specificity.
TNR = TN / (TN + FP) (source: https://en.wikipedia.org/wiki/Confusion_matrix)

Again, if you are getting confused, just focus on the literal meaning of the term. For example, here it is the True Negative Rate, so we are concerned with the rate at which we correctly predict the actual Negatives, and remember that we calculate the rate with respect to the actual labels.
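In code (same hypothetical counts), TNR uses the same denominator as FPR, so it is simply 1 - FPR:

```python
tn, fp = 3, 1  # hypothetical counts from the earlier sketch

tnr = tn / (tn + fp)  # denominator = all actual Negatives
fpr = fp / (fp + tn)
print(tnr, 1 - fpr)  # 0.75 0.75, i.e. TNR = 1 - FPR
```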

  • FNR (False Negative Rate) — Here we are calculating FNR, so the actual label is Positive and therefore the rate is calculated with respect to the positive labels. Hence the definition goes: FNR implies, out of all the actual positive labels (True Positive and False Negative), how many the model predicted incorrectly (False Negative).
FNR = FN / (FN + TP) (source: https://en.wikipedia.org/wiki/Confusion_matrix)
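And a matching sketch for FNR, which shares its denominator with TPR and is therefore 1 - TPR:

```python
fn, tp = 1, 3  # hypothetical counts from the earlier sketch

fnr = fn / (fn + tp)  # denominator = all actual Positives
tpr = tp / (tp + fn)
print(fnr, 1 - tpr)  # 0.25 0.25, i.e. FNR = 1 - TPR
```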
  • Precision — This simply tells you how precise your model is, and since the general convention is to express these metrics with respect to the positive labels, the definition goes like this: Precision is, out of all the positive labels your model predicted (i.e. True Positive and False Positive), how many were correct (i.e. True Positive). Note that, unlike the rates above, the denominator here is the predicted positive labels, not the actual ones.
Precision = TP / (TP + FP) (source: https://en.wikipedia.org/wiki/Confusion_matrix)
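A last sketch, with the same hypothetical labels, showing the hand computation next to scikit-learn's precision_score:

```python
from sklearn.metrics import precision_score  # assumes scikit-learn is installed

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical actual labels
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]  # hypothetical predictions

tp, fp = 3, 1  # counts for these example lists
print(tp / (tp + fp))                   # 0.75, denominator = predicted Positives
print(precision_score(y_true, y_pred))  # 0.75, scikit-learn agrees
```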

Thanks for reading till the end, and I hope you now have a better understanding of these evaluation metrics.
