Classification: ROC vs PR Scores

Divyesh Bhatt · Published in The ML Classroom · Mar 22, 2023

As a machine learning practitioner, you may often come across the terms ROC and PR scores. These scores are used to evaluate the performance of classification models. ROC and PR scores are often used interchangeably, but they are not the same. In this blog, we will discuss the differences between the two and their formulas.

ROC Score:

ROC stands for Receiver Operating Characteristic. The ROC curve is a graphical representation of the performance of a classification model. The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) at different classification thresholds. The area under the ROC curve (AUC-ROC) is a metric commonly used to measure the overall performance of a classification model.

The TPR is defined as the ratio of true positive predictions to the total number of positive samples. Mathematically,

TPR = TP / (TP + FN)

where TP is the number of true positives, and FN is the number of false negatives.

The FPR is defined as the ratio of false positive predictions to the total number of negative samples. Mathematically,

FPR = FP / (FP + TN)

where FP is the number of false positives, and TN is the number of true negatives.
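
To make these two definitions concrete, here is a minimal sketch that computes TPR and FPR from a confusion matrix with scikit-learn. The labels and predictions are made up purely for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true labels and hard (thresholded) predictions.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# For binary labels {0, 1}, confusion_matrix returns [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)  # TPR = TP / (TP + FN)
fpr = fp / (fp + tn)  # FPR = FP / (FP + TN)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")  # TPR = 0.75, FPR = 0.25
```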

The ROC curve is plotted by varying the classification threshold and calculating the TPR and FPR at each threshold. The AUC-ROC is calculated by finding the area under the ROC curve.
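
In practice you rarely sweep the threshold by hand; scikit-learn does it for you. A short sketch (the scores below are hypothetical, chosen only to show the API):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical true labels and predicted probabilities for the positive class.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.55, 0.9])

# roc_curve sweeps the classification threshold and returns the
# FPR/TPR pair at each threshold; the curve is just these points.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc_roc = roc_auc_score(y_true, y_score)  # area under the ROC curve

for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
print(f"AUC-ROC = {auc_roc:.3f}")
```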

PR Score:

PR stands for Precision-Recall. The PR curve is another graphical representation of the performance of a classification model. The PR curve plots the precision against the recall at different classification thresholds. The area under the PR curve (AUC-PR) is a metric commonly used to measure the overall performance of a classification model.

Precision is defined as the ratio of true positive predictions to the total number of positive predictions. Mathematically,

Precision = TP / (TP + FP)

where TP is the number of true positives, and FP is the number of false positives.

Recall is defined as the ratio of true positive predictions to the total number of positive samples. Mathematically,

Recall = TP / (TP + FN)

where TP is the number of true positives, and FN is the number of false negatives.

The PR curve is plotted by varying the classification threshold and calculating the precision and recall at each threshold. The AUC-PR is calculated by finding the area under the PR curve.
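
Again, scikit-learn handles the threshold sweep. A minimal sketch, reusing the same hypothetical scores as the ROC example above:

```python
import numpy as np
from sklearn.metrics import auc, average_precision_score, precision_recall_curve

# Same hypothetical labels and scores as in the ROC sketch.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.55, 0.9])

# precision_recall_curve sweeps the threshold and returns a
# precision/recall pair at each one.
precision, recall, thresholds = precision_recall_curve(y_true, y_score)

# Two common summaries: the trapezoidal area under the PR curve,
# and average precision, a step-wise summary scikit-learn recommends
# because the trapezoid rule can be overly optimistic on PR curves.
auc_pr = auc(recall, precision)
avg_precision = average_precision_score(y_true, y_score)
print(f"AUC-PR (trapezoid) = {auc_pr:.3f}, average precision = {avg_precision:.3f}")
```

Note that recall is exactly the TPR from the ROC section; the two curves share one axis (under different names) and differ in the other.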

Differences between ROC and PR Scores:

ROC and PR scores are both used to evaluate the performance of classification models, but they differ in the following ways:

  • ROC curves plot TPR against FPR, while PR curves plot precision against recall.
  • ROC curves are informative when the classes are roughly balanced, while PR curves are more informative when the classes are imbalanced.
  • A high ROC score indicates that the model achieves a high true positive rate at a low false positive rate, while a high PR score indicates that it achieves both high precision and high recall.
  • The ROC curve is largely insensitive to class imbalance, because FPR is computed over the negatives alone; precision, in contrast, depends directly on the ratio of positives to negatives among the predictions. This is why ROC scores can look deceptively strong on heavily imbalanced data (see the sketch after this list).
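
The last point is easy to see empirically. Below is a minimal sketch on a synthetic, heavily imbalanced dataset (roughly 1% positives, generated with make_classification; the dataset and model are illustrative, not a benchmark). In settings like this, AUC-ROC typically comes out much higher than AUC-PR, because the flood of easy negatives barely moves the FPR but drags precision down.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical, highly imbalanced binary dataset: ~1% positive class.
X, y = make_classification(n_samples=20000, weights=[0.99], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]

# Compare the two summaries on the same predictions: the ROC score
# tends to look comfortably high, while the PR score (average
# precision) gives a more sobering view of the minority class.
print(f"AUC-ROC = {roc_auc_score(y_te, scores):.3f}")
print(f"AUC-PR  = {average_precision_score(y_te, scores):.3f}")
```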

In conclusion, ROC and PR scores are both important metrics for evaluating classification models. The choice between the two depends on the nature of the problem, and in particular on how balanced the classes are.
