Confused about Confusion Matrix?


A confusion matrix is an N×N matrix (N = number of target classes) containing the counts of predicted versus actual target values. It helps in evaluating the performance of a classification model.
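As a quick sketch (the labels here are made up purely for illustration), scikit-learn can build this matrix from true and predicted labels:

```python
from sklearn.metrics import confusion_matrix

# Toy true and predicted labels (made up for illustration):
# 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# Rows correspond to actual classes, columns to predicted classes.
cm = confusion_matrix(y_true, y_pred)
print(cm)
# With label order [0, 1]: cm[0, 0] = TN, cm[0, 1] = FP,
#                          cm[1, 0] = FN, cm[1, 1] = TP
```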

Let’s take a look at a confusion matrix for binary classification to understand the terminology used.

                    Predicted Positive      Predicted Negative
Actual Positive     True Positive (TP)      False Negative (FN)
Actual Negative     False Positive (FP)     True Negative (TN)

Whether an email is spam or not, whether a patient is diabetic or not, whether a customer will churn or not: these are some examples of binary classification. Say our classification model tells us whether an animal is a dog or a cat. The positive class for this example is “Dog” and the negative class is “Cat”.

  • True Positive: Positive case correctly predicted as positive, i.e. a dog is predicted as ‘dog’.
  • True Negative: Negative case correctly predicted as negative, i.e. a cat is predicted as ‘cat’.
  • False Positive: a negative case incorrectly predicted as positive, i.e. a cat is predicted as ‘dog’. This is a Type I error.
  • False Negative: a positive case incorrectly predicted as negative, i.e. a dog is predicted as ‘cat’. This is a Type II error.

Now let’s see how these values from the confusion matrix help us evaluate a classifier’s performance.

Accuracy
How often does the classification model predict correctly?
80% accuracy means out of 10 cases, 8 are correctly predicted.
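In terms of the confusion matrix counts:

Accuracy = (TP + TN) / (TP + TN + FP + FN)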

Precision
Whenever the classifier predicts positive, how often is it correct?
A precision of 70% means that if 10 cases are predicted to be dogs, 3 of them are actually cats.
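As a formula:

Precision = TP / (TP + FP)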

Recall / Sensitivity
It is the True Positive Rate, i.e. it answers the question:
Out of all the actual positive cases, how many are correctly predicted as positive by the classifier?
90% recall means that out of 10 dog cases, we missed 1 dog.
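In terms of the counts:

Recall = TP / (TP + FN)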

Specificity
It is the True Negative Rate, i.e. it answers the question:
Out of all the actual negative cases, how many are correctly predicted as negative by the classifier?
60% specificity means that we missed 4 out of 10 cats and wrongly classified them as dogs.
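Written out:

Specificity = TN / (TN + FP)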

Let’s address a common question: what is the difference between Precision and Recall, and how do we decide which of the two to use?

Precision is used when False Positives are a higher concern than False Negatives.
For example, in a music recommendation system, incorrect recommendations (False Positives) can annoy users and lead them to churn.

Recall, on the other hand, is used in cases where False Positives aren’t really harmful but False Negatives are. For example, suppose the classifier tells us whether a patient is diabetic or not. If a patient is falsely classified as diabetic (FP), further tests can catch the mistake. But if a diabetic patient is classified as non-diabetic (FN), the patient will go unattended, which is a much bigger problem.

In some cases we can’t really decide which of Precision or Recall is more important, so it’s best to combine the two. That is where the F-score comes in.

F-Score / F1-Score
It is the harmonic mean of precision and recall.
An F1-score close to 1 is considered best and close to 0 worst. It is useful in classification problems where True Negatives don’t matter much.
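As a formula:

F1 = 2 × (Precision × Recall) / (Precision + Recall)

Continuing the toy labels from the confusion matrix sketch above, scikit-learn computes all of these metrics directly:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Same toy labels as in the confusion matrix sketch above.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

print(accuracy_score(y_true, y_pred))   # (TP + TN) / total
print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # TP / (TP + FN)
print(f1_score(y_true, y_pred))         # harmonic mean of the two above
```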

AUC-ROC
The ROC curve is the Receiver Operating Characteristic curve, and AUC-ROC is the Area Under the ROC Curve. The ROC curve plots the True Positive Rate (TPR) on the y-axis against the False Positive Rate (FPR) on the x-axis as the classification threshold varies. It gives an overall measure of the classifier’s performance.
The greater the area under the ROC curve, the better the model is at distinguishing between the classes.

source: https://www.datasciencecentral.com/profiles/blogs/roc-curve-explained-in-one-picture

AUC near 1: the classifier is good at distinguishing between the classes, i.e. both positive and negative cases are mostly predicted correctly.
AUC around 0.5: the worst case, as the classifier is incapable of distinguishing between the classes at all.
AUC near 0: the classifier’s predictions are reversed, i.e. positives are predicted as negatives and vice versa.
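As a minimal sketch (the scores here are made up for illustration), scikit-learn can compute both the points along the ROC curve and the AUC:

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Toy true labels and predicted positive-class probabilities
# (made up for illustration).
y_true  = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_score = [0.9, 0.4, 0.8, 0.2, 0.1, 0.7, 0.6, 0.3, 0.95, 0.05]

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # points along the ROC curve
auc = roc_auc_score(y_true, y_score)               # area under that curve
print(auc)
```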
