Understanding the Receiver Operating Characteristic (ROC) Curve

Harshal Kothawade · Published in Nerd For Tech · 4 min read · Aug 6, 2021


Introduction

A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. ROC analysis provides a systematic tool for quantifying the impact of variability among individuals’ decision thresholds. The method was originally developed for operators of military radar receivers starting in 1941, which led to its name.

History of ROC

During World War II, radar operators had to decide whether a blip on the screen was a German bomber, a friendly plane, or just noise. As radar technology advanced during the war, the need for a standard system to evaluate detection accuracy became apparent. ROC analysis was developed as a standard methodology to quantify a signal receiver's ability to correctly distinguish objects of interest from the background noise in the system.

An optimum observer required to give a yes or no answer simply chooses an operating level and concludes that the receiver input arose from signal plus noise only when this level is exceeded by the output of his likelihood ratio receiver. Associated with each such operating level are conditional probabilities that the answer is a false alarm and the conditional probability of detection. Graphs of these quantities called receiver operating characteristic, or ROC, curves are convenient for evaluating a receiver. If the detection problem is changed by varying, for example, the signal power, then a family of ROC curves is generated. Such things as betting curves can easily be obtained from such a family.

The earliest mention of the ROC curve — abstract from Peterson, W., Birdsall, T., and Fox, W. (1954). "The theory of signal detectability," Transactions of the IRE Professional Group on Information Theory, 4(4), pp. 171–212.

Why use ROC?

Use in Different Domains:

After World War II, ROC was soon introduced to psychology to account for perceptual detection of stimuli. ROC analysis since then has been used in medicine, radiology, biometrics, forecasting of natural hazards, meteorology, model performance assessment, and other areas for many decades and is increasingly used in machine learning and data mining research.

Radiologists face the task of identifying abnormal tissue against a complicated background. For instance, each radiologist has his or her own visual clues guiding them to a clinical decision as to whether the pattern variation of a mammogram indicates tissue abnormalities or just normal variation. The varying decisions make up a range of decision thresholds.

Machine Learning Perspective:

The ROC curve is an evaluation metric for binary classification problems. It is a probability curve that plots the True Positive Rate (TPR) against False Positive Rate (FPR) at various threshold values and essentially separates the ‘signal’ from the ‘noise’.

What is ROC?

As mentioned above, the plot of TPR against FPR is the ROC curve. In other words, it is a graph of sensitivity against (1 − specificity). In the ROC curve, a point further to the right on the X-axis means the classifier produces more false positives relative to true negatives, while a point higher on the Y-axis means it produces more true positives relative to false negatives.
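Both axes can be computed directly from the confusion matrix at a given threshold: TPR = TP / (TP + FN) and FPR = FP / (FP + TN). A minimal sketch (the labels and scores below are made up purely for illustration):

```python
import numpy as np

# Hypothetical true labels and predicted probabilities (illustrative only)
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.5])

def tpr_fpr(y_true, y_score, threshold):
    """Binarize scores at `threshold`, then compute TPR and FPR."""
    y_pred = (y_score >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    return tp / (tp + fn), fp / (fp + tn)

# Sweeping the threshold traces out the ROC curve point by point
for t in (0.3, 0.5, 0.7):
    tpr, fpr = tpr_fpr(y_true, y_score, t)
    print(f"threshold={t}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

Lowering the threshold moves the operating point up and to the right (more positives of both kinds); raising it moves the point down and to the left.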

What is AUC and interpretation of ROC?

Interpretation of ROC depends on the value of AUC. Let’s understand what AUC is.

The Area Under the Curve (AUC) is the measure of the ability of a classifier to distinguish between classes and is used as a summary of the ROC curve. The higher the AUC, the better the performance of the model at distinguishing between the positive and negative classes.

AUC = 1 (Best Possible Model)
  • When AUC = 1, the classifier perfectly separates all the positive and negative class points.
  • Conversely, if AUC were 0, the classifier would rank every positive below every negative — predicting all negatives as positives and all positives as negatives.
0.5 < AUC < 1 (General Case Scenario)
  • When 0.5 < AUC < 1, there is a good chance that the classifier will be able to distinguish the positive class values from the negative class values.
  • This is because the classifier detects more true positives and true negatives than false negatives and false positives.
AUC = 0.5 (No Better Than Random)
  • When AUC = 0.5, the classifier cannot distinguish between positive and negative class points — it has no discriminative power.
  • In practice this means the classifier is predicting either a random class or a constant class for all the data points.
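An equivalent way to see these three cases: AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A small NumPy sketch (the score arrays are invented for illustration) reproduces the AUC = 1 and AUC = 0 extremes:

```python
import numpy as np

def auc_rank(y_true, y_score):
    """AUC as P(score of a random positive > score of a random negative),
    counting ties as half."""
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    greater = np.sum(pos[:, None] > neg[None, :])
    ties = np.sum(pos[:, None] == neg[None, :])
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

y_true = np.array([0, 0, 1, 1])
perfect = np.array([0.1, 0.2, 0.8, 0.9])   # every positive outranks every negative
inverted = np.array([0.9, 0.8, 0.2, 0.1])  # every positive ranked below every negative

print(auc_rank(y_true, perfect))   # 1.0 — best possible model
print(auc_rank(y_true, inverted))  # 0.0 — predictions are exactly reversed
```

A classifier that assigns scores at random lands near 0.5 under this ranking view, which is why AUC = 0.5 marks the no-discrimination baseline.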

Code
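The original code snippet is not reproduced here; as a sketch of the typical workflow, the following example uses scikit-learn on synthetic data (the dataset, logistic regression model, and split parameters are all illustrative choices, not the author's original setup):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (illustrative only)
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Fit a classifier and get probability scores for the positive class
model = LogisticRegression().fit(X_train, y_train)
y_score = model.predict_proba(X_test)[:, 1]

# Compute the ROC curve points and the summary AUC
fpr, tpr, thresholds = roc_curve(y_test, y_score)
auc = roc_auc_score(y_test, y_score)

plt.plot(fpr, tpr, label=f"ROC curve (AUC = {auc:.2f})")
plt.plot([0, 1], [0, 1], linestyle="--", label="Random classifier (AUC = 0.5)")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend()
plt.show()
```

The dashed diagonal is the AUC = 0.5 baseline; the further the curve bows toward the top-left corner, the better the classifier separates the two classes.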

Conclusion

The ROC curve serves as a decision-support tool for machine learning classification tasks. With the AUC-ROC value, one can quickly gauge how well a classifier separates the classes and assess the predictive power of the model.
