Confusion matrix and its interpretation

PRAVESH GREWAL
Published in Python’s Gurus
4 min read · Jun 12, 2024

A confusion matrix is a matrix that summarizes the performance of a machine learning model on a set of test data. It is a means of displaying the number of accurate and inaccurate instances based on the model’s predictions. It is often used to measure the performance of classification models, which aim to predict a categorical label for each input instance.

The matrix displays the number of instances produced by the model on the test data.

  • True positives (TP): the model predicts positive and the actual label is positive.
  • True negatives (TN): the model predicts negative and the actual label is negative.
  • False positives (FP): the model predicts positive but the actual label is negative.
  • False negatives (FN): the model predicts negative but the actual label is positive.

A confusion matrix compares two values for each instance: the predicted label and the actual (real) label.

How can we calculate a confusion matrix?

For each instance, a model gives us two values: y, the actual (real) value, and ŷ, the predicted value.

  • When the actual and predicted values are both 1, the result is a true positive.
  • When the actual and predicted values are both 0, the result is a true negative.
  • When the actual value is 1 and the predicted value is 0, the result is a false negative.
  • When the actual value is 0 and the predicted value is 1, the result is a false positive.
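The four rules above can be sketched as a small counting function (a minimal sketch; the function name and sample labels are made up for illustration):

```python
# Count TP, TN, FP, FN by comparing actual labels (y) with predictions (y_hat).
def confusion_counts(y, y_hat):
    tp = sum(1 for a, p in zip(y, y_hat) if a == 1 and p == 1)  # both 1
    tn = sum(1 for a, p in zip(y, y_hat) if a == 0 and p == 0)  # both 0
    fp = sum(1 for a, p in zip(y, y_hat) if a == 0 and p == 1)  # actual 0, predicted 1
    fn = sum(1 for a, p in zip(y, y_hat) if a == 1 and p == 0)  # actual 1, predicted 0
    return tp, tn, fp, fn

y     = [1, 0, 1, 1, 0, 0, 1]  # actual values
y_hat = [1, 0, 0, 1, 1, 0, 1]  # predicted values
print(confusion_counts(y, y_hat))  # (3, 2, 1, 1)
```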

The correct predictions are the true positives and the true negatives: correct results = TP + TN.

Accuracy:-

How can we find the accuracy from a confusion matrix?

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Let’s assume TP = 3, TN = 1, FP = 2, FN = 1.

(3 + 1) / (3 + 1 + 2 + 1) = 4/7 ≈ 57%

This means our model has 57% accuracy.
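The same calculation in code, using the counts from the example above:

```python
# Accuracy = (TP + TN) / (TP + TN + FP + FN), with the example counts.
TP, TN, FP, FN = 3, 1, 2, 1
accuracy = (TP + TN) / (TP + TN + FP + FN)
print(f"{accuracy:.0%}")  # 57%
```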

Precision & Recall:-

Suppose we have a dataset where 900 instances are labelled 0 and 100 are labelled 1, for a total of 1,000.

This is an imbalanced dataset, so we can’t rely on accuracy alone here. Instead, we can use precision and recall.
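To see why accuracy misleads on this 900/100 split, consider a model that always predicts 0 (a hypothetical “always negative” model, made up for illustration):

```python
# 900 negatives, 100 positives: a model that always predicts 0
# looks 90% accurate but never finds a single positive case.
y     = [0] * 900 + [1] * 100   # actual labels
y_hat = [0] * 1000              # "always negative" model
accuracy = sum(a == p for a, p in zip(y, y_hat)) / len(y)
tp = sum(a == 1 and p == 1 for a, p in zip(y, y_hat))
print(accuracy)  # 0.9
print(tp)        # 0 -> every actual positive is missed
```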

Precision:-

Precision = TP / (TP + FP)

Let’s assume an email is not spam, but our model flags it as spam. That is a bad outcome, so in this scenario we care about precision, because we need to keep the false positives low.

Recall:-

Recall = TP / (TP + FN)

Let’s assume we have to predict whether a person has cancer.

If a person has cancer and our model predicts no cancer, that is a dangerous outcome. We need to reduce the false negatives here, so recall is the right metric.
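Both formulas are one-liners; here is a minimal sketch (the counts are assumed values, reused from the accuracy example):

```python
# Precision = TP / (TP + FP): of everything predicted positive, how much was right.
def precision(tp, fp):
    return tp / (tp + fp)

# Recall = TP / (TP + FN): of all actual positives, how many were found.
def recall(tp, fn):
    return tp / (tp + fn)

TP, FP, FN = 3, 2, 1  # example counts (assumed)
print(precision(TP, FP))  # 0.6
print(recall(TP, FN))     # 0.75
```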

So, when a missed positive harms a person (as in the cancer example), we can use recall; when a false alarm harms the business (as in the spam example), we can use precision.

F-Beta Score:-

F_β = (1 + β²) · (P · R) / (β² · P + R)

Take another example. We need to predict whether the stock market will crash tomorrow or not. In these types of scenarios, we can use the F-Beta score; this is important for both individuals and companies.

If FP and FN are equally important, we use β = 1:

F₁ = 2 · (P · R) / (P + R)

If FP is more important than FN (FP >> FN), we use β = 0.5:

F₀.₅ = (1 + 0.25) · (P · R) / (0.25 · P + R)

If FN is more important than FP (FN >> FP), we use β = 2:

F₂ = (1 + 4) · (P · R) / (4 · P + R)
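All three cases fall out of one general formula; a minimal sketch, using the precision and recall values assumed earlier:

```python
# F-beta = (1 + beta^2) * P * R / (beta^2 * P + R)
# beta < 1 weights precision more; beta > 1 weights recall more.
def f_beta(p, r, beta):
    return (1 + beta**2) * p * r / (beta**2 * p + r)

P, R = 0.6, 0.75  # example precision and recall (assumed)
print(round(f_beta(P, R, 1.0), 3))  # 0.667 -> F1, balanced
print(round(f_beta(P, R, 0.5), 3))  # 0.625 -> favors precision (FP costlier)
print(round(f_beta(P, R, 2.0), 3))  # 0.714 -> favors recall (FN costlier)
```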
