Classification Metrics 101

Kelly
Published in Analytics Vidhya · Jan 28, 2020

Let’s talk about classification metrics by first introducing our confusion matrix.

Here is a typical sample confusion matrix that I created. Note that the numbers don’t mean anything; they’re just for the example.

Confusion matrices typically follow this layout: a 2 x 2 matrix where the columns are your predicted positives and negatives and the rows are your actual positives and negatives. Keep in mind that these all refer to our Y variable.

There are numerous metrics, but we’re going to stay at the surface level to keep things friendly for those who, like me, may not have a degree in Mathematics or have studied Linear Algebra and Abstract Math. All we’re going to do is work with our confusion matrix and introduce some aliases and formulas.

In our confusion matrix, we have four numbers: 10, 20, 30, and 40. Remember, these numbers mean nothing here, but when you build a confusion matrix on your own, they’ll come from comparing your actual Y values against the Y values predicted by a model you created. To read the matrix, think back to when we were learning our times tables for the first time: put one finger on the top and one finger on the left, then move both in a straight line until your fingers meet (at least that’s how I was taught).

In our example case:

  • 10 = Predicted Positive and Actual Positive
  • 30 = Predicted Positive and Actual Negative
  • 20 = Predicted Negative and Actual Positive
  • 40 = Predicted Negative and Actual Negative

Now, it seems like a lot to refer to each of these numbers by their row and column labels, so let’s introduce some new names!

  • 10 = Predicted Positive and Actual Positive = True Positive
  • 30 = Predicted Positive and Actual Negative = False Positive (Type 1 Error)
  • 20 = Predicted Negative and Actual Positive = False Negative (Type 2 Error)
  • 40 = Predicted Negative and Actual Negative = True Negative

As you can see, we have two types of errors: Type 1 and Type 2. There’s nothing too special about them; the names were simply assigned, so they’re just something to memorize. One little trick is to think about writing out the “p” in “False Positive.” (This may not work for you if you don’t write a “p” like I do.) When I write a “p,” I draw a vertical line first and then finish it off with a backwards “c” on the top half of the line. Similarly, I write my “1”s as a vertical line. So when I write out “False Positive” and get to the “p,” I think about writing a 1 for Type 1. Then, by process of elimination, False Negative is Type 2.

One more alias! Let’s just shorten the new names we got:

  • 10 = Predicted Positive and Actual Positive = True Positive = TP
  • 30 = Predicted Positive and Actual Negative = False Positive = FP
  • 20 = Predicted Negative and Actual Positive = False Negative = FN
  • 40 = Predicted Negative and Actual Negative = True Negative = TN
Here we took our sample confusion matrix and just added in our new labels.
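
If you’d like to follow along in code, here’s a small sketch (assuming pandas, which isn’t required for anything above) that rebuilds the example matrix with those labels:

```python
import pandas as pd

# The example confusion matrix from this post: TP = 10, FN = 20, FP = 30, TN = 40
# Rows are actual values, columns are predicted values
cm_df = pd.DataFrame(
    [[10, 20],   # Actual Positive:  TP, FN
     [30, 40]],  # Actual Negative:  FP, TN
    index=["Actual Positive", "Actual Negative"],
    columns=["Predicted Positive", "Predicted Negative"],
)
print(cm_df)
```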

Now, the fun part: metrics! We’re going to discuss five basic metrics: accuracy, misclassification, sensitivity, specificity, and precision. Overall, it’s going to be a lot of memorizing formulas, but hopefully my tips will help you remember them.

Accuracy = Correct Predictions / All Predictions

If we refer back to our confusion matrix and plug in what we know, our correct predictions are the true positives and true negatives (10 + 40 = 50) out of 100 total predictions.

Essentially, accuracy gives us an idea of how well our model is doing, since it’s the proportion of correctly predicted values out of all of our predictions. In this case, we are 50% accurate.

Misclassification = 1 - Accuracy

Misclassification is relatively simple: it’s the opposite of accuracy, so it tells us how far off our model is by giving us the proportion of incorrectly predicted values out of all of our predictions. In this case, we misclassified 50% of our predictions.
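
Plugging our example numbers into these two formulas, here’s a quick sketch:

```python
# Example confusion matrix values from this post
TP, FP, FN, TN = 10, 30, 20, 40

accuracy = (TP + TN) / (TP + FP + FN + TN)   # (10 + 40) / 100 = 0.5
misclassification = 1 - accuracy             # 1 - 0.5 = 0.5

print(accuracy, misclassification)           # 0.5 0.5
```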

Sensitivity = True Positives / All Actual Positives

Sensitivity, also known as True Positive Rate or Recall, tells us the proportion of actual positives that our model correctly predicts as positive. Even with our new abbreviations, it can be confusing to remember what to calculate, so here’s a helpful tip!

If you spell out “sensitivity” there are no “p”s in the word, but the formula involves all our “p”s, or “positives.” We need our true positives and our false negatives. Now you’re probably thinking “false negative” doesn’t have a “p” and you’re right. However, if you think back to basic math, we can associate false as negative and negative as, well, negative and we know that a negative and a negative make a positive!
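
With our example numbers, sensitivity works out like this:

```python
# True positives and false negatives from the example matrix
TP, FN = 10, 20

sensitivity = TP / (TP + FN)   # 10 / (10 + 20)
print(sensitivity)             # 0.333..., so we catch about a third of the actual positives
```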

Specificity = True Negatives / All Actual Negatives

Specificity, also known as True Negative Rate, tells us the proportion of actual negatives that our model correctly predicts as negative. Similar to sensitivity, here’s a helpful tool for specificity!

If you spell out “specificity” there is a “p” in the word, but the formula involves none of our “p”s, or “positives.” We need our true negatives and our false positives. Now you’re probably thinking “false positive” does have a “p” and you’re right. However, it’s the same trick as above. We can associate false as negative and positive as, well, positive and we know that a negative and a positive make a negative!
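
And specificity with the same example numbers:

```python
# True negatives and false positives from the example matrix
TN, FP = 40, 30

specificity = TN / (TN + FP)   # 40 / (40 + 30)
print(specificity)             # about 0.571, so we catch roughly 57% of the actual negatives
```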

Precision = True Positives / Predicted Positives

Like accuracy, precision is a proportion: it’s the number of correctly predicted positive values over all of our predicted positive values, or in other words our percentage of correct positive predictions. Precision is also known as Positive Predictive Value, and we can remember it because precision starts with a “p,” so we need all the positives: true positives and false positives.
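
One last plug-in, this time for precision:

```python
# True positives and false positives from the example matrix
TP, FP = 10, 30

precision = TP / (TP + FP)   # 10 / (10 + 30)
print(precision)             # 0.25, so 25% of our positive predictions were correct
```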

Whew, that’s all the formulas! Now what do we do with all this information? Well, if you ever find yourself using a classification model and wanting to see how well it’s doing, you can create a confusion matrix to find out.

Here’s an instance from my notebook.

In the image above, I wanted to see how well my model was doing after training and fitting it on existing data. In this case, I was using a logistic regression model, log_reg, and used the .predict() method to generate predicted Y values from my X variables. Afterwards, I passed my actual Y values and my predicted Y values into a confusion matrix, which yields an array. No extra work on my end, thanks to Python libraries! The last code block converts my confusion matrix array into a DataFrame to make it easier to read and more appealing to the eyes. With the four numbers from my confusion matrix I can easily calculate my classification metrics, but even from looking at the matrix by itself, I can see that my model does quite well: I only have two Type 2 errors, and all of my other predictions are correct.
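
The notebook itself isn’t reproduced here, but here’s a minimal, runnable sketch of that kind of workflow, assuming scikit-learn and pandas. The synthetic dataset and every name other than log_reg and .predict() are placeholders for illustration, not the exact code from my notebook:

```python
# A minimal sketch of the workflow, assuming scikit-learn and pandas;
# the synthetic dataset and variable names are placeholders
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# Placeholder data so the example runs end to end
X, y = make_classification(n_samples=200, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Train the model, then generate predicted Y values from the X variables
log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X_train, y_train)
y_pred = log_reg.predict(X_test)

# Compare actual Y values against predicted Y values; labels=[1, 0] matches
# the layout used in this post ([[TP, FN], [FP, TN]]) rather than
# scikit-learn's default ordering ([[TN, FP], [FN, TP]])
cm = confusion_matrix(y_test, y_pred, labels=[1, 0])

# Convert the array into a DataFrame so it's easier on the eyes
cm_df = pd.DataFrame(
    cm,
    index=["Actual Positive", "Actual Negative"],
    columns=["Predicted Positive", "Predicted Negative"],
)
print(cm_df)
```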
