What are Precision, Recall, F1-score and the Confusion Matrix?

Dami Sparks
Published in Analytics Vidhya · Dec 9, 2020

I believe these are common concepts that are easily mixed up. It took me a while to grasp them, so I am sharing how I understood them.

First, let's start with the confusion matrix, which is also known as an error matrix. This is a fundamental concept to understand in the field of machine learning.

  • The confusion matrix displays the number of true positives, true negatives, false positives, and false negatives given some number of input data points (usually n)

In the confusion matrix image above, n is the number of input data points. Here we have two classes (a short code sketch follows this list):

  • Actual No: Class 1
  • Actual Yes: Class 2
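
To make this concrete, here is a minimal Python sketch using scikit-learn. The labels below are made-up toy data (not the example from this article), just to show the mechanics of building a confusion matrix for two classes and unpacking the four counts:

```python
# A minimal sketch of building a confusion matrix with scikit-learn.
# The labels here are invented toy data, not this article's example.
from sklearn.metrics import confusion_matrix

y_actual    = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes", "No"]
y_predicted = ["No", "Yes", "Yes", "Yes", "No", "No", "Yes", "No"]

# With labels=["No", "Yes"], row/column 0 correspond to "No" (Class 1)
# and row/column 1 correspond to "Yes" (Class 2).
cm = confusion_matrix(y_actual, y_predicted, labels=["No", "Yes"])

# ravel() flattens the 2x2 matrix into the four counts.
tn, fp, fn, tp = cm.ravel()
print(cm)
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")
```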

The confusion matrix helps you understand the precision and recall of your model. How? I will come back to this. First, let's talk about precision and recall.

Precision and Recall are just different metrics for measuring the “success” or performance of a trained model.

Precision shows how often the model is correct when it predicts the positive label. That is the number of true positives divided by all predicted positives, and it will be higher when the number of false positives is low. The formula is True Positives (TP) over Total Predicted Positives (TP + FP), where FP means False Positives.

A False Positive is when the model incorrectly indicates the presence of a condition that is actually absent.

In the example, the actual label is No but the model predicted Yes.

So in the example above, that would be 45 over 60 (where TP = 45 and FP = 15, so TP + FP = 60) for a precision score of 75%.
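
As a quick sanity check, here is the same arithmetic in Python, using the TP and FP counts from the example:

```python
# Precision = TP / (TP + FP), with the counts from the example above.
tp = 45  # true positives
fp = 15  # false positives

precision = tp / (tp + fp)
print(precision)  # 0.75 -> a precision score of 75%
```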

Recall shows how many of the actual positives your model correctly predicted. The formula is True Positives over Total Actual Positives (TP + FN). That is the number of true positives divided by true positives plus false negatives, and it will be higher when the number of false negatives is low.

A False Negative is when the model incorrectly fails to indicate the presence of a condition that is actually present. (False positives and false negatives are also known as Type I and Type II errors, respectively.)

So here that's 45 out of 50 (where TP = 45 and FN = 5) for a recall score of 90%.
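
And the same check for recall, using the TP and FN counts from the example:

```python
# Recall = TP / (TP + FN), with the counts from the example above.
tp = 45  # true positives
fn = 5   # false negatives

recall = tp / (tp + fn)
print(recall)  # 0.9 -> a recall score of 90%
```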

Going back to what I said before: how does the confusion matrix help you understand the precision and recall of your model?

Simply, it helps by counting out the true and false positives, as well as the true and false negatives, which are exactly the counts the formulas above need.
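
In practice you rarely do this counting by hand. As a sketch (again using the same made-up toy labels as the earlier example), scikit-learn can compute precision and recall directly from the actual and predicted labels, doing the same counting internally:

```python
# scikit-learn computes precision and recall straight from the labels,
# performing the same TP/FP/FN counting as the confusion matrix.
from sklearn.metrics import precision_score, recall_score

y_actual    = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes", "No"]
y_predicted = ["No", "Yes", "Yes", "Yes", "No", "No", "Yes", "No"]

# pos_label="Yes" tells scikit-learn which class counts as "positive".
print(precision_score(y_actual, y_predicted, pos_label="Yes"))  # 0.75 for these toy labels
print(recall_score(y_actual, y_predicted, pos_label="Yes"))     # 0.75 for these toy labels
```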

Next, let’s talk about F1-Score.

The F1 score combines precision and recall into a single metric. Because there is often a trade-off between the two, rather than aiming for higher precision or higher recall alone, you can aim to increase the F1 score.

The formula is the harmonic mean of precision and recall:

F1 = 2 × (Precision × Recall) / (Precision + Recall)

At its highest, the F1 score is 1, indicating perfect precision and recall; at its lowest possible value, it is 0.

In the example given at the beginning of the article, the F1 score works out to approximately 0.82 (from a precision of 0.75 and a recall of 0.90).
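
A minimal check of that number, plugging the precision and recall from the example into the formula:

```python
# F1 is the harmonic mean of precision and recall.
precision = 0.75  # from the precision example: 45 / 60
recall = 0.90     # from the recall example: 45 / 50

f1 = 2 * (precision * recall) / (precision + recall)
print(round(f1, 2))  # 0.82
```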

Thank you for reading.


Software Engineer with an eye for design | Let’s get in touch 👉 linkedin.com/in/damisparks