ROC Curve|Classification Metrics|Why metrics can’t replace your common sense?

Yash Patil
Published in Analytics Vidhya
5 min read · Mar 12, 2020

Metrics are blind by nature. Optimizing blindly for a single quantity can lead to entirely unsuitable outcomes. Some well-established media sites optimize for “user clicks”. Enthusiastic journalists and young advertisers realize that users are drawn to titles such as “This *** will change your life”, and this gives birth to clickbait. Although such headlines induce users to click, they also turn users off and lead them to avoid spending time on “clickbait-filled” sites. Hence, optimizing for clicks ends up earning the users’ disinterest.

In short, optimizing for one quantity often comes at the cost of another. So it is important to choose the right quantity to optimize.

Binary Classification Metrics

Let us understand this with the help of an example. Consider a case with four inputs and a single binary output that tells us whether a person is diabetic or non-diabetic: ‘0’ means the person is non-diabetic and ‘1’ means the person is diabetic. After building a logistic regression model, we have the following confusion matrix.

Confusion Matrix

Classification accuracy

Classification accuracy is the ratio of correct predictions to the total number of predictions.

Sensitivity/Recall

Sensitivity or recall is the ratio of correct positive predictions (true positives) to the total number of actual positives. This is also called the True Positive Rate.

False Positive Rate

The false-positive rate is the ratio of actual negatives that were wrongly predicted to be positive (false positives) to the total number of actual negatives.

Precision

Precision is the ratio of correct positive predictions (true positives) to the total number of positive predictions.

Specificity

Specificity is the ratio of correct negative predictions (true negatives) to the total number of actual negatives.
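The five definitions above can be sketched in a few lines of Python. The counts below are hypothetical and are not taken from the article's confusion matrix, and the function name classification_metrics is made up for this illustration.

```python
# Hypothetical confusion-matrix counts (illustrative only, not the article's data).
tp, fn = 40, 10    # actual diabetic:     predicted 1 / predicted 0
fp, tn = 20, 130   # actual non-diabetic: predicted 1 / predicted 0


def classification_metrics(tp, fn, fp, tn):
    """Compute the metrics defined above from the four confusion-matrix cells."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,          # correct / all predictions
        "sensitivity": tp / (tp + fn),          # true positive rate (recall)
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
        "precision": tp / (tp + fp),            # correct among predicted positives
        "specificity": tn / (fp + tn),          # true negative rate
    }


metrics = classification_metrics(tp, fn, fp, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the false-positive rate and specificity share the same denominator (the actual negatives), so they always add up to 1.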

ROC curve

A ROC curve is used to visualize the performance of a binary classifier, meaning a classifier with two output classes. The curve plots the True Positive Rate (Recall) against the False Positive Rate (also interpreted as 1-Specificity).

The TPR and FPR values are calculated at different cutoff values, and each cutoff corresponds to one point on the curve. The cutoff whose point lies nearest the top-left corner, where TPR is 1 and FPR is 0, is usually taken as the optimum. It is very difficult to read the optimum cutoff off the curve by eye, but the following methods can be used to find it.
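To make the idea of "one point per cutoff" concrete, here is a minimal sketch in plain Python. The labels and predicted probabilities are invented for illustration, the helper roc_points is hypothetical, and a case is treated as positive when its score is at or above the cutoff (an assumption; some tools use a strict inequality).

```python
# Hypothetical true labels (1 = diabetic) and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]


def roc_points(y_true, y_score):
    """Return (cutoff, fpr, tpr) tuples, using each distinct score as a cutoff."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for cutoff in sorted(set(y_score), reverse=True):
        # Predict positive when the score is at or above the cutoff.
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= cutoff and y == 0)
        points.append((cutoff, fp / negatives, tp / positives))
    return points


points = roc_points(y_true, y_score)
for cutoff, fpr, tpr in points:
    print(f"cutoff={cutoff:.2f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting FPR on the x-axis against TPR on the y-axis for these points traces out the ROC curve; lowering the cutoff moves along the curve toward (1, 1).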

Methods to find the ‘optimal’ threshold point

Three criteria are used to find the optimal threshold point from the ROC curve. The first two give equal weight to sensitivity and specificity and impose no ethical, cost, or prevalence constraints. The third considers cost, which mainly includes the financial cost of correct and false diagnoses, the cost of discomfort to the person caused by treatment, and the cost of further investigation when needed. These three criteria are known as the point on the curve closest to (0, 1), the Youden index, and the minimize-cost criterion, respectively.

If sn and sp denote sensitivity and specificity, respectively, the distance between the point (0, 1) and any point on the ROC curve is $d = \sqrt{(1 - sn)^2 + (1 - sp)^2}$.

To obtain the optimal cut-off point for discriminating diseased from non-diseased subjects, calculate this distance for each observed cut-off point and locate the point where the distance is smallest. Most ROC analysis software calculates the sensitivity and specificity at all observed cut-off points, allowing you to do this exercise.
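The closest-to-(0, 1) search can be sketched as follows. The (cutoff, sensitivity, specificity) triples are hypothetical, and distance_to_corner is a name chosen for this example; it computes the Euclidean distance from the ROC point (1 - sp, sn) to the corner (0, 1).

```python
import math

# Hypothetical (cutoff, sensitivity, specificity) triples, e.g. as reported
# by ROC software for each observed cut-off point.
candidates = [
    (0.8, 0.40, 0.95),
    (0.6, 0.70, 0.85),
    (0.4, 0.85, 0.60),
    (0.2, 0.95, 0.30),
]


def distance_to_corner(sn, sp):
    """Euclidean distance from the ROC point (1 - sp, sn) to the corner (0, 1)."""
    return math.sqrt((1 - sn) ** 2 + (1 - sp) ** 2)


# Pick the cut-off whose ROC point is nearest the top-left corner.
best_cutoff, best_sn, best_sp = min(
    candidates, key=lambda c: distance_to_corner(c[1], c[2])
)
print(f"closest-to-(0, 1) cutoff: {best_cutoff}")
```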

The second is the Youden index, which maximizes the vertical distance from the line of equality to the point [x, y], as shown in Figure 3, where x represents (1 - specificity) and y represents sensitivity. In other words, the Youden index J is attained at the point on the ROC curve that is farthest from the line of equality (the diagonal). The aim is to maximize the difference between TPR (sn) and FPR (1 - sp), and a little algebra yields J = max[sn + sp - 1]. Because subtracting the constant 1 does not change where the maximum occurs, the cutoff for a continuous test can be located by searching the plausible values for the one at which the sum of sensitivity and specificity is largest.
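A minimal sketch of the Youden search, again on hypothetical (cutoff, sensitivity, specificity) triples; youden_j is a name made up for this illustration and simply evaluates sn - (1 - sp) = sn + sp - 1.

```python
# Hypothetical (cutoff, sensitivity, specificity) triples.
candidates = [
    (0.8, 0.40, 0.95),
    (0.6, 0.70, 0.85),
    (0.4, 0.85, 0.60),
    (0.2, 0.95, 0.30),
]


def youden_j(sn, sp):
    """Vertical distance from the diagonal: TPR - FPR = sn + sp - 1."""
    return sn + sp - 1


# The Youden-optimal cutoff is the one with the largest J.
youden_cutoff, sn_at_best, sp_at_best = max(
    candidates, key=lambda c: youden_j(c[1], c[2])
)
j_value = youden_j(sn_at_best, sp_at_best)
print(f"Youden cutoff: {youden_cutoff}, J = {j_value:.2f}")
```

On this made-up data the Youden criterion and the closest-to-(0, 1) criterion happen to select the same cutoff, but the two can disagree in general.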

Among these methods, the Youden index is the more commonly used criterion because it reflects the intention to maximize the correct classification rate and is easy to calculate. Many authors advocate this criterion.

This is part 1. In part 2 we will see its interpretation using the ROCIT package in R.

Part 2 is here
