Precision and recall

Lidet Tefera
2 min read · Jul 5, 2020


Precision and Recall are two of the most fundamental evaluation metrics for classification. Precision measures what fraction of the model’s positive predictions are actually correct, while Recall indicates what percentage of the actual positive cases the model managed to capture.

Precision is the number of correct positive results divided by the total predicted positive observations.

Precision = Number of True Positives / Number of Predicted Positives

Recall is a metric that quantifies the number of correct positive predictions made out of all positive predictions that could have been made.

Recall = Number of True Positives / Number of Actual Total Positives

Unlike precision, which only comments on the correct positive predictions out of all positive predictions, recall gives an indication of missed positive predictions. In this way, recall provides some notion of the coverage of the positive class.
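The two formulas above can be sketched directly in a few lines of Python. The labels and predictions below are invented for illustration:

```python
# Hypothetical labels for a small binary-classification example:
# 1 = positive class, 0 = negative class. Values are made up.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0, 0, 1]

# True Positives: cases that are positive AND predicted positive
true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
predicted_positives = sum(y_pred)  # everything the model called positive
actual_positives = sum(y_true)     # every positive that actually exists

precision = true_positives / predicted_positives
recall = true_positives / actual_positives

print(f"Precision: {precision:.2f}")  # 3 correct out of 4 predicted -> 0.75
print(f"Recall:    {recall:.2f}")     # 3 captured out of 5 actual   -> 0.60
```

Note how the two metrics share the same numerator (True Positives) but divide by different totals, which is exactly the difference the formulas express.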

Which one is better?

A classic Data Science interview question asks, “Which is worse: more false positives, or more false negatives?” The answer depends on the problem. Sometimes the model is applied to a problem where False Positives are much worse than False Negatives, or vice versa. Take credit card fraud detection as an example. A False Positive occurs when our model flags a transaction as fraudulent when it isn’t. This results in a slightly annoyed customer.

On the other hand, a False Negative might be a fraudulent transaction that the company mistakenly lets through as normal consumer behavior. In this case, the credit card company could be on the hook for reimbursing the customer for thousands of dollars because they missed the signs that the transaction was fraudulent! Although being wrong is never ideal, it makes sense that credit card companies tend to build their models to be a bit too sensitive, because having a high recall saves them more money than having a high precision score.
