Confusion Matrix
There are multiple ways to measure the performance of a classification model, e.g., accuracy. Accuracy, however, is not always the best measure, as it can be misleading when the classes are imbalanced. Accuracy is also a poor fit for problems where both positive and negative outcomes matter, such as in the health care industry.
Confusion Matrix
A confusion matrix is a tabular summary of the actual versus predicted labels of a classification model. It works for both binary and multiclass classification, and it is one of the most popular ways to evaluate classifiers such as logistic regression models and recommender systems.
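As a minimal sketch, here is how a binary confusion matrix can be built with scikit-learn (assumed installed); the labels and predictions below are hypothetical.

```python
# Build a confusion matrix for a binary classifier's output.
from sklearn.metrics import confusion_matrix

# Hypothetical ground-truth labels and model predictions (1 = positive).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

cm = confusion_matrix(y_true, y_pred)
# For binary labels, scikit-learn lays the matrix out as [[TN, FP], [FN, TP]].
tn, fp, fn, tp = cm.ravel()
print(cm)
```

The four counts (TN, FP, FN, TP) unpacked here are the building blocks for every metric discussed below.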
Precision
Precision is the ratio of true positives to the sum of true positives and false positives. Intuitively, it measures how often the model is right when it predicts a specific category.
It is a good choice of metric when you care a lot about false positives and the goal is to minimize them, e.g., medical screening or drug testing.
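A one-line sketch of the formula, using hypothetical counts:

```python
# Precision = TP / (TP + FP): of everything predicted positive,
# how much really was positive. Counts below are hypothetical.
tp, fp = 40, 10
precision = tp / (tp + fp)
print(precision)
```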
Recall or Sensitivity
Recall is the ratio of true positives to all positives in the ground truth. Intuitively, it measures how many of the actual instances of a category the model was able to detect.
It is a good choice of metric when you care a lot about false negatives and the goal is to minimize them, e.g., fraud detection.
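The corresponding sketch for recall, again with hypothetical counts:

```python
# Recall = TP / (TP + FN): of all actual positives,
# how many the model detected. Counts below are hypothetical.
tp, fn = 45, 15
recall = tp / (tp + fn)
print(recall)
```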
F1 Score
The F1 score is the harmonic mean of precision and recall. When both precision and recall matter, it is a useful metric because it is representative of both: it is high only when the two are high together.
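The harmonic mean can be sketched directly from precision and recall values (the ones below are hypothetical):

```python
# F1 = 2 * P * R / (P + R), the harmonic mean of precision and recall.
# A low value in either metric drags the F1 score down.
precision, recall = 0.8, 0.6
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))
```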
Accuracy
Classification accuracy is perhaps the simplest metric to use and implement. It is defined as the number of correct predictions divided by the total number of predictions, multiplied by 100.
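In terms of the confusion-matrix counts, that definition reads as follows (counts are hypothetical):

```python
# Accuracy = correct predictions / total predictions * 100.
# Correct predictions are the diagonal of the confusion matrix: TP + TN.
tp, tn, fp, fn = 40, 30, 10, 20
accuracy = (tp + tn) / (tp + tn + fp + fn) * 100
print(accuracy)
```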
Other Measurements
We can derive various other measurements from the confusion matrix.
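For example, specificity (the true negative rate) and the false positive rate come from the same four counts; the values below are hypothetical.

```python
# Two further measurements derivable from a confusion matrix.
tp, tn, fp, fn = 40, 30, 10, 20
specificity = tn / (tn + fp)          # fraction of actual negatives kept
false_positive_rate = fp / (fp + tn)  # fraction of actual negatives flagged
print(specificity, false_positive_rate)
```

Note that the two always sum to 1, since they split the actual negatives between correct and incorrect predictions.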
Multi-Class Classification
It uses a one-vs-rest methodology to calculate the different metrics.
Here is an example of what a confusion matrix looks like for multiclass classification.
Calculating precision and recall is simple for binary classification. For multiclass classification, we compute the metric for each class (treating that class as positive and the rest as negative) and then take the average across classes.
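A sketch of this one-vs-rest averaging using scikit-learn's "macro" average, with hypothetical three-class labels:

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical three-class labels; each class is scored one-vs-rest
# and the per-class scores are averaged ("macro" averaging).
y_true = [0, 1, 2, 0, 1, 2, 0, 1, 2]
y_pred = [0, 1, 2, 0, 2, 1, 0, 1, 0]

macro_precision = precision_score(y_true, y_pred, average="macro")
macro_recall = recall_score(y_true, y_pred, average="macro")
print(macro_precision, macro_recall)
```

Other averaging schemes exist (e.g., "weighted", which weights each class by its support), but macro averaging is the most direct translation of "average the per-class scores".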
Type 1 & Type 2 Error
Type 1
A Type 1 error is also known as a false positive and occurs when a researcher incorrectly rejects a true null hypothesis. This means you report that your findings are significant when in fact they occurred by chance.
Type 2
A Type 2 error is also known as a false negative and occurs when a researcher fails to reject a null hypothesis that is actually false. Here the researcher concludes there is no significant effect when in fact there is one.
Using Heat Map
You can also represent a confusion matrix as a heat map, where the intensity of the color in each cell reflects the magnitude of its count.
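A sketch of rendering a confusion matrix as a heat map with matplotlib and seaborn (both assumed installed); the counts and the output file name are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical binary counts laid out as [[TN, FP], [FN, TP]].
cm = [[50, 3], [5, 42]]

ax = sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
                 xticklabels=["Pred 0", "Pred 1"],
                 yticklabels=["True 0", "True 1"])
ax.set_xlabel("Predicted label")
ax.set_ylabel("Actual label")
plt.savefig("confusion_matrix.png")
```

The `annot=True` flag prints the raw count inside each cell, so the reader gets both the color cue and the exact number.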
I hope this article provides you with a basic understanding of Confusion Matrix.
If you have any questions or if you find anything misrepresented please let me know.
Thanks!