Precision/Recall Tradeoff

Amit Upadhyay
Published in Analytics Vidhya
7 min read · Aug 10, 2020

Before looking at the precision/recall tradeoff, we need to understand precision and recall themselves. To understand those, let's first understand the confusion matrix.

Confusion Matrix

The confusion matrix is one of the most important performance measurement tools for classification in machine learning. Classification is a supervised learning task: you provide labeled data to the model or algorithm. Labeled data means that each set of features (inputs) comes with its known output, also called the target or class, as in the example table below.

Labeled Data

In a real-world scenario, data will not arrive in this form. We have to collect it and preprocess it so that it can be fed to a classifier. Once the data is preprocessed, we divide it into two parts: a training dataset to train the model and a testing dataset to test the performance of the model. Both sets are split into X and Y, where X holds the features and Y holds the target or class. We train the classifier on Xtrain and Ytrain (the training dataset). Once the model is trained, we use it to compute Ypredicted from Xtest. We already have the actual Y values for Xtest, namely Ytest, so we can check the performance of the model by comparing Ytest with Ypredicted.

Full DataSet
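The split-train-predict workflow described above can be sketched as follows. This is a minimal illustration, not the article's original code: the use of only the two sepal measurements, the 30% test size and the `random_state` are assumptions chosen so that the classes overlap a little and the later tradeoff is visible.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
# Assumption: use only sepal length and sepal width as features (X).
X = iris.data[:, :2]
# Binary target (Y): 1 = Iris-Virginica, 0 = not Iris-Virginica.
y = (iris.target == 2).astype(int)

# Hold out 30% of the rows for testing; stratify keeps the class ratio similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Train on the training dataset only.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Ypredicted: the classifier's predictions for Xtest, to be compared with Ytest.
y_pred = clf.predict(X_test)
print(len(X_train), len(X_test), y_pred[:10], y_test[:10])
```

Here `y_test` plays the role of Ytest and `y_pred` the role of Ypredicted.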

A good way to evaluate the performance of a classifier is to look at its confusion matrix. Below is an example of a confusion matrix, created using a Logistic Regression classifier on the Iris data set. Let us understand how to read it.

· Each row represents an actual class (target).

· Each column represents a predicted class (target).

· The shape of the confusion matrix is N × N, where N is the number of distinct classes. Here the classifier is binary, so N = 2: class 1 is the positive identification (Iris-Virginica) and class 0 is the negative identification (not Iris-Virginica).

· The total number of actual values equals the total number of predicted values, because every test instance has exactly one actual class and one predicted class.

· True Positive (TP): the model predicts that instance A is Iris-Virginica, and it actually is an Iris-Virginica flower.

· True Negative (TN): the model predicts that instance A is NOT Iris-Virginica, and it actually is NOT an Iris-Virginica flower.

· False Positive (FP): the model predicts that instance A is Iris-Virginica, but it actually is NOT an Iris-Virginica flower.

· False Negative (FN): the model predicts that instance A is NOT Iris-Virginica, but it actually is an Iris-Virginica flower.

Confusion Matrix
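As a rough sketch of how such a matrix can be produced, the snippet below uses Scikit-Learn's `confusion_matrix()`. The feature choice and split settings are assumptions for illustration, so the counts will not match the article's screenshot.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data[:, :2]                 # assumption: sepal features only
y = (iris.target == 2).astype(int)   # 1 = Iris-Virginica, 0 = not Iris-Virginica

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_test, y_pred)
tn, fp, fn, tp = cm.ravel()
print(cm)
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```

The four counts always add up to the number of test instances, which matches the point made in the list above.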

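Precision: the accuracy of the positive predictions, that is, the fraction of instances predicted as positive that are actually positive. Precision = TP / (TP + FP).

Recall: the ratio of positive instances that are correctly detected by the classifier. Recall = TP / (TP + FN). It is also called sensitivity or the true positive rate.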
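To make the two definitions concrete, here is a small worked example on hand-made labels (the arrays are invented purely for illustration). It computes precision and recall from the TP, FP and FN counts and checks them against Scikit-Learn's `precision_score()` and `recall_score()`.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 4 actual positives, 6 actual negatives.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 1, 0, 0, 0, 0])

tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # 3
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # 2
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # 1

precision = tp / (tp + fp)  # 3 / 5 = 0.6
recall = tp / (tp + fn)     # 3 / 4 = 0.75

print(precision, precision_score(y_true, y_pred))  # both 0.6
print(recall, recall_score(y_true, y_pred))        # both 0.75
```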

Precision/Recall Tradeoff:

In some cases you mostly care about precision, and in other contexts you mostly care about recall.

1. Example of high precision → Video streaming platforms such as YouTube offer a restricted mode that hides violent and adult videos from kids. Such a model should focus on high precision {TP/(TP+FP)}: if the model classifies a video as good for kids, it must really be safe for kids to watch. This is achieved by reducing the false positives (unsafe videos classified as safe), which raises precision.

Let’s take another example: a model that detects shoplifters in a mall. Again, the aim is that when the model classifies a customer as a shoplifter, that customer actually is a shoplifter. That means high precision {TP/(TP+FP)} and few false positives.

2. Example of high recall → Suppose you are creating a model to detect whether a patient has a disease. Here the aim is high recall {TP/(TP+FN)}, which means a small number of false negatives: if the model predicts that a patient does not have the disease, the patient must really not have it. Think about the opposite case: the model predicts that you do not have the disease, you go on enjoying your life, and later you find out that you have the disease and it has reached its last stage.

Another example is a model that detects whether a loan applicant is a defaulter. Again, the aim is high recall {TP/(TP+FN)}: if the model says that an applicant is not a defaulter, the applicant must really not be a defaulter. So the model should reduce the false negatives (defaulters classified as non-defaulters), which increases recall.

Unfortunately, you usually can’t have both precision and recall as high as you like. For a given classifier, increasing precision generally reduces recall, and vice versa. This is called the precision/recall tradeoff.

Scikit-Learn's predict() method does not let you set the threshold directly, but the library does give you access to the decision scores that it uses to make predictions. The classifier computes a decision score for each instance. If the score is greater than the threshold, it predicts the positive class, meaning the instance belongs to the target class. Otherwise, it predicts the negative class.

Instead of calling the classifier’s predict() method, you can call its decision_function() method, which returns a score for each instance, and then make predictions based on those scores using any threshold you want:
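A minimal sketch of this idea is shown here. The data setup (sepal features only, 30% test split) and the example threshold of 1.0 are assumptions made for illustration; `decision_function()` is the real Scikit-Learn method.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data[:, :2]                 # assumption: sepal features only
y = (iris.target == 2).astype(int)   # 1 = Iris-Virginica, 0 = not Iris-Virginica

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# One decision score per test instance.
scores = clf.decision_function(X_test)

# Thresholding at 0 reproduces what predict() returns.
y_pred_default = (scores > 0.0).astype(int)
print(np.array_equal(y_pred_default, clf.predict(X_test)))

# A higher, arbitrarily chosen threshold predicts the positive class less often.
y_pred_strict = (scores > 1.0).astype(int)
print(y_pred_default.sum(), y_pred_strict.sum())
```

Raising the threshold can only turn positive predictions into negative ones, which is why recall can never go up when the threshold goes up.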

In Scikit-Learn, binary linear classifiers such as Logistic Regression use a threshold of 0 on the decision score by default. So if you compute Ypredicted by comparing the scores with a threshold of 0, the result is the same as the one returned by the predict() method. The question then is which threshold value we should use. Let’s look at some graphs.

Precision and Recall Vs Threshold Graph

In the graph, the X axis shows the threshold value and the Y axis shows the precision and recall values. If you increase the threshold, precision generally increases but recall decreases; if you decrease the threshold, recall increases but precision generally decreases. At the default threshold (zero), precision is below 80% and recall is above 80%. The screenshot below was taken from the same code that was used to draw this graph.

Precision and Recall at Default Threshold Value (Zero)
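The same behaviour can be checked numerically by sweeping a few thresholds and printing precision and recall at each one. This is only a sketch: the data setup and the threshold values are assumptions, so the numbers will differ from those in the article's screenshots.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data[:, :2]                 # assumption: sepal features only
y = (iris.target == 2).astype(int)   # 1 = Iris-Virginica, 0 = not Iris-Virginica

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.decision_function(X_test)

results = []
for threshold in (-1.0, -0.5, 0.0, 0.5, 1.0):  # illustrative values
    y_pred_t = (scores > threshold).astype(int)
    # zero_division=0 avoids a warning if no instance is predicted positive.
    p = precision_score(y_test, y_pred_t, zero_division=0)
    r = recall_score(y_test, y_pred_t, zero_division=0)
    results.append((threshold, p, r))
    print(f"threshold={threshold:+.1f}  precision={p:.2f}  recall={r:.2f}")
```

Recall never increases as the threshold goes up. Precision usually rises but is not guaranteed to do so at every step.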

So now we know: if we need higher precision, the threshold must be set higher than the default value (zero), and if we need higher recall, the threshold must be set lower than the default value (zero).

Another way to select a good precision/recall tradeoff is to plot precision directly against recall, as shown below.

Precision vs recall Graph
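A plot of this kind can be drawn from the output of `precision_recall_curve()`. The sketch below assumes matplotlib is installed and writes the figure to a file; the data setup, the file name and the non-interactive backend are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display; remove this line to use plt.show()
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data[:, :2]                 # assumption: sepal features only
y = (iris.target == 2).astype(int)   # 1 = Iris-Virginica, 0 = not Iris-Virginica

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.decision_function(X_test)

# precisions and recalls each have one more element than thresholds.
precisions, recalls, thresholds = precision_recall_curve(y_test, scores)

fig, ax = plt.subplots()
ax.plot(recalls, precisions)
ax.set_xlabel("Recall")
ax.set_ylabel("Precision")
ax.set_title("Precision vs recall")
fig.savefig("precision_vs_recall.png")  # hypothetical output file name
```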

You can see that precision really starts to fall sharply around 80% recall. You will probably want to select a precision/recall tradeoff just before that drop — for example, at around 60% recall. But of course, the choice depends on your project.

Suppose you need high precision: it should be at least 90%. The default threshold is 0, and we have seen that precision at the default threshold was 75%, so to achieve higher precision we need to raise the bar. We will use the numpy.argmax() function to search for the lowest threshold that gives at least 90% precision.

Set Threshold value higher for High Precision
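One way this search might look in code is sketched here. The data setup is an assumption, and the guard for the case where no threshold reaches 90% precision is an addition for safety, since `np.argmax()` returns 0 when the condition is never true.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve, precision_score
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data[:, :2]                 # assumption: sepal features only
y = (iris.target == 2).astype(int)   # 1 = Iris-Virginica, 0 = not Iris-Virginica

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.decision_function(X_test)

precisions, recalls, thresholds = precision_recall_curve(y_test, scores)

# precisions has one more entry than thresholds, so drop the final element.
meets_target = precisions[:-1] >= 0.90
if meets_target.any():
    # thresholds is sorted in increasing order, so the first True
    # marks the lowest threshold with at least 90% precision.
    idx = np.argmax(meets_target)
    threshold_90_precision = thresholds[idx]
    # precision_recall_curve treats score >= threshold as a positive prediction.
    y_pred_90 = (scores >= threshold_90_precision).astype(int)
    print(threshold_90_precision, precision_score(y_test, y_pred_90))
else:
    print("No threshold reaches 90% precision on this test set.")
```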

Now suppose you need higher recall. At the default threshold (zero), recall was around 83%, and we know that to achieve higher recall we need to lower the bar, that is, the threshold. We will use the numpy.argmin() function to search for the highest threshold that gives at least 90% recall.

Set Threshold value lower for High Recall
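A possible sketch of the recall search follows; the data setup is again an assumption. Note that `np.argmin()` on the boolean array returns the first position where recall has already dropped below 90%, so this sketch steps back one position to get the highest threshold that still meets the target.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve, recall_score
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data[:, :2]                 # assumption: sepal features only
y = (iris.target == 2).astype(int)   # 1 = Iris-Virginica, 0 = not Iris-Virginica

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.decision_function(X_test)

precisions, recalls, thresholds = precision_recall_curve(y_test, scores)

# Recall starts at 1.0 for the lowest threshold and never increases afterwards.
meets_target = recalls[:-1] >= 0.90
if meets_target.all():
    idx = len(thresholds) - 1              # recall never drops below 90%
else:
    idx = np.argmin(meets_target) - 1      # last position before the first False
threshold_90_recall = thresholds[idx]

# precision_recall_curve treats score >= threshold as a positive prediction.
y_pred_90 = (scores >= threshold_90_recall).astype(int)
print(threshold_90_recall, recall_score(y_test, y_pred_90))
```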

Thanks for reading the story. You can visit my YouTube channel for the implementation of the article.
