Simplifying The Confusion Matrix

Machine learning is a vast field with many concepts to learn. One of the most useful for statistical classification problems is the confusion matrix, also known as the error matrix.

This article aims to explain the confusion matrix in simple terms.

Let’s start with the basics.

A confusion matrix is used to summarize and evaluate the performance of a classification model, most commonly a binary one.

The key idea is that it counts the correct and incorrect predictions and breaks those counts down by class.

In doing so, it shows the ways in which a classification model gets confused when it makes predictions.

A formal definition of a confusion matrix is:

A confusion matrix is a table that outlines different predictions and test results and contrasts them with real-world values. Confusion matrices are used in statistics, data mining, machine learning models and other artificial intelligence (AI) applications. A confusion matrix can also be called an error matrix.

By Margaret Rouse.

Confusion matrices are used to make the in-depth analysis of statistical data faster and the results easier to read through clear data visualization.

Below is a simple example of a confusion matrix:

It has 2 rows and 2 columns, with one cell for each of the following: True Positives, True Negatives, False Positives and False Negatives.

Here the predictions are the rows and the actual values are the columns.

Diving a little deeper into each of the terms:
• Positive (P): Actual is positive (for example: is an apple).
• Negative (N): Actual is not positive (for example: is not an apple).
• True Positive (TP): Actual is positive, and is predicted to be positive.
• False Negative (FN): Actual is positive, but is predicted negative.
• True Negative (TN): Actual is negative, and is predicted to be negative.
• False Positive (FP): Actual is negative, but is predicted positive.
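
The four outcomes above can be sketched in a few lines of Python. This is only a minimal illustration: it assumes labels are encoded as 1 (positive, "is an apple") and 0 (negative), and the function name `outcome` and the sample lists are made up for the example.

```python
from collections import Counter


def outcome(actual, predicted):
    """Name the confusion-matrix cell for one (actual, predicted) pair.

    Assumes 1 = positive and 0 = negative (an encoding chosen for this sketch).
    """
    if actual == 1:
        return "TP" if predicted == 1 else "FN"
    return "FP" if predicted == 1 else "TN"


# Hypothetical labels for eight fruits: 1 = apple, 0 = not an apple.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

# Tally how many pairs fall into each of the four cells.
counts = Counter(outcome(a, p) for a, p in zip(actual, predicted))
print(dict(counts))
```

Every pair lands in exactly one of the four cells, so the four counts always add up to the number of examples.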

Let’s understand it with the example of an HIV test:

From the above diagram we can say:

• True Positives (TP): The test predicted positive (will have), and they actually have the disease.
• True Negatives (TN): The test predicted negative (will not have), and they actually don’t have the disease.
• False Positives (FP): The test predicted positive (will have), but they actually don’t have the disease. (Also known as a “Type I error.”)
• False Negatives (FN): The test predicted negative (will not have), but they actually have the disease. (Also known as a “Type II error.”)

Now let’s take a numerical example from dataschool.io to understand the confusion matrix and the list of rates that are calculated from it for a binary classifier.
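
The rates usually derived from a binary confusion matrix can be sketched as follows. The counts used here are hypothetical placeholders chosen for illustration, not the dataschool.io figures, and the function name `rates` is an assumption of this sketch. It also assumes every denominator is non-zero.

```python
def rates(tp, tn, fp, fn):
    """Common rates computed from the four cells of a binary confusion matrix.

    Assumes each denominator is non-zero (no guard against empty classes).
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,                # how often the model is right
        "misclassification_rate": (fp + fn) / total,  # how often it is wrong
        "recall": tp / (tp + fn),                     # true positive rate (sensitivity)
        "false_positive_rate": fp / (fp + tn),
        "specificity": tn / (tn + fp),                # true negative rate
        "precision": tp / (tp + fp),
        "prevalence": (tp + fn) / total,              # share of actual positives
    }


# Hypothetical counts, for illustration only.
result = rates(tp=40, tn=45, fp=5, fn=10)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Accuracy and the misclassification rate always sum to 1, as do specificity and the false positive rate, which is a quick sanity check on any such table.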

A few other terms to know:

High recall, low precision: Indicates that most of the positive examples are correctly recognized (low FN) but there are a lot of false positives.

Low recall, high precision: Indicates that we miss a lot of positive examples (high FN) but those we predict as positive are indeed positive (low FP).
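
One way to see both situations is to vary the decision threshold of a scoring model. The sketch below uses made-up labels and scores and a hypothetical helper `precision_recall`. Here precision is the fraction of predicted positives that are truly positive, TP / (TP + FP), and recall is the fraction of actual positives that are found, TP / (TP + FN). Returning 0.0 when a denominator is zero is a convention assumed for this sketch.

```python
def precision_recall(actual, scores, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    # Convention assumed here: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical true labels (1 = positive) and model scores.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.40, 0.70, 0.55, 0.45, 0.30, 0.20, 0.10]

# A low threshold flags many examples: no positives missed, more false alarms.
low = precision_recall(actual, scores, threshold=0.35)
# A high threshold flags very few: no false alarms, many positives missed.
high = precision_recall(actual, scores, threshold=0.90)

print("low threshold  (precision, recall):", low)
print("high threshold (precision, recall):", high)
```

With these made-up numbers, the low threshold gives the high recall, low precision case and the high threshold gives the low recall, high precision case.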

F-score

It is the harmonic mean of precision and recall, so it takes both into account when computing the score.

The higher the F-score, the better the predictive power of the classification procedure.

A score of 1 means the classification procedure is perfect. The lowest possible F-score is 0.
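
As a minimal sketch, the harmonic mean of precision and recall can be written as 2 × precision × recall / (precision + recall). The function name `f_score` is hypothetical, and returning 0.0 when both inputs are 0 is a convention assumed here to avoid dividing by zero.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (often called the F1 score)."""
    if precision + recall == 0:
        # Convention assumed for this sketch: define the score as 0.0 here.
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(f_score(1.0, 1.0))  # perfect precision and recall
print(f_score(0.5, 1.0))  # unbalanced: lower than the arithmetic mean of 0.75
print(f_score(0.0, 0.0))  # worst case
```

Because it is a harmonic mean, the score is pulled toward the smaller of the two values, so a model cannot score well by being strong on only one of them.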

Python code implementation example:
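
What follows is a minimal sketch in plain Python rather than a definitive implementation. It assumes labels are encoded as 1 (positive) and 0 (negative) and keeps the layout used in this article, with predictions as rows and actual values as columns. The helper name `build_confusion_matrix` and the sample data are hypothetical.

```python
def build_confusion_matrix(actual, predicted):
    """Return the 2x2 matrix [[TP, FP], [FN, TN]].

    Rows are predictions (positive, negative); columns are actual values
    (positive, negative). Labels are assumed to be 1 = positive, 0 = negative.
    """
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1
        elif p == 1 and a == 0:
            fp += 1
        elif p == 0 and a == 1:
            fn += 1
        else:
            tn += 1
    return [[tp, fp], [fn, tn]]


# Hypothetical test results: 1 = has the disease, 0 = does not.
actual = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

matrix = build_confusion_matrix(actual, predicted)
(tp, fp), (fn, tn) = matrix

print("                     actual positive  actual negative")
print(f"predicted positive   {tp:>15}  {fp:>15}")
print(f"predicted negative   {fn:>15}  {tn:>15}")

# Metrics derived from the four cells (denominators are non-zero for this data).
accuracy = (tp + tn) / len(actual)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

In practice a library would usually do this. For instance, scikit-learn provides `sklearn.metrics.confusion_matrix`, but note that it puts actual values in the rows and predictions in the columns, the opposite of the layout used here.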