Manpreet Saluja
5 min read · Apr 17, 2019

Understanding AUC and ROC Curve

In machine learning, performance measurement is an essential task. So when it comes to a classification problem, we can count on the AUC-ROC curve. When we need to check or visualize the performance of a classification model, including in the multi-class setting, we use the AUC (Area Under the Curve) ROC (Receiver Operating Characteristic) curve. It is one of the most important evaluation metrics for checking any classification model's performance. It is also written as AUROC (Area Under the Receiver Operating Characteristic curve).

Note: For better understanding, I suggest you read my article about the Confusion Matrix.

This blog aims to answer the following questions:

1. What is the AUC-ROC Curve?

2. Defining terms used in AUC and ROC Curve.

3. How to interpret the performance of the model?

4. Relation between Sensitivity, Specificity, FPR and Threshold.

5. How to use the AUC-ROC curve for a multi-class model?

What is the AUC-ROC Curve?

The AUC-ROC curve is a performance measurement for classification problems at various threshold settings. ROC is a probability curve, and AUC represents the degree or measure of separability: it tells how capable the model is of distinguishing between classes. The higher the AUC, the better the model is at predicting 0s as 0s and 1s as 1s. By analogy, the higher the AUC, the better the model is at distinguishing between patients with the disease and patients without it.

The ROC curve plots TPR against FPR, with TPR on the y-axis and FPR on the x-axis.

[Figure: the AUC-ROC curve]
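As a concrete illustration, here is a minimal sketch of computing and plotting an ROC curve with scikit-learn. The synthetic dataset and the logistic-regression model are my own illustrative choices, not part of the original example.

```python
# Minimal sketch: fit a classifier, score the test set, plot ROC and report AUC.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression().fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]  # probability of the positive class

fpr, tpr, thresholds = roc_curve(y_test, scores)
auc = roc_auc_score(y_test, scores)

plt.plot(fpr, tpr, label=f"model (AUC = {auc:.3f})")
plt.plot([0, 1], [0, 1], "--", label="random classifier (AUC = 0.5)")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate (Sensitivity)")
plt.legend()
plt.show()
```

Note that `roc_curve` needs the continuous scores (probabilities), not the hard 0/1 predictions, since the curve is traced out by sweeping a threshold over those scores.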

Defining terms used in AUC and ROC Curve.

TPR (True Positive Rate) / Recall / Sensitivity

TPR = TP / (TP + FN)

Specificity

Specificity = TN / (TN + FP)

FPR

FPR = FP / (FP + TN) = 1 - Specificity

Here TP, FP, TN and FN are the true positive, false positive, true negative and false negative counts from the confusion matrix.
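To make the definitions concrete, here is a small sketch computing all three rates from a confusion matrix; the labels and predictions are made-up illustrative values.

```python
# Compute TPR, specificity and FPR from the four confusion-matrix counts.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)          # sensitivity / recall
specificity = tn / (tn + fp)  # true negative rate
fpr = fp / (fp + tn)          # equals 1 - specificity

print(f"TPR = {tpr:.2f}, Specificity = {specificity:.2f}, FPR = {fpr:.2f}")
```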

How to interpret the performance of the model?

An excellent model has an AUC near 1, which means it has a good measure of separability. A poor model has an AUC near 0, which means it has the worst measure of separability; in fact, it is inverting the predictions, classifying 0s as 1s and 1s as 0s. And when AUC is 0.5, the model has no class separation capacity whatsoever.

Let's interpret the above statements.

As we know, the ROC is a probability curve, so let's plot the distributions of those probabilities:

Note: the red distribution curve is for the positive class (patients with the disease) and the green distribution curve is for the negative class (patients without the disease).

[Figure: non-overlapping class distributions and the corresponding ROC curve, AUC = 1.0]

This is the ideal situation. When the two curves don't overlap at all, the model has an ideal measure of separability: it is perfectly able to distinguish between the positive class and the negative class.

[Figure: partially overlapping class distributions and the corresponding ROC curve, AUC = 0.7]

When the two distributions overlap, we introduce type-1 and type-2 errors, and depending on the threshold we can minimize or maximize them. When AUC is 0.7, there is a 70% chance that the model will rank a randomly chosen positive example above a randomly chosen negative one; that is the precise sense in which it can "distinguish between" the two classes.
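Here is a quick sketch (with made-up scores) verifying that probabilistic reading of AUC: it matches the fraction of (positive, negative) pairs the model ranks correctly.

```python
# AUC equals the probability that a random positive outscores a random negative.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=2000)
# Noisy scores: positives tend to score higher, so AUC lands between 0.5 and 1.
scores = y + rng.normal(scale=1.5, size=y.size)

pos, neg = scores[y == 1], scores[y == 0]
# Fraction of (positive, negative) pairs ranked correctly.
pairwise = (pos[:, None] > neg[None, :]).mean()

print(f"roc_auc_score:             {roc_auc_score(y, scores):.4f}")
print(f"pairwise rank probability: {pairwise:.4f}")  # the two values agree
```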

[Figure: heavily overlapping class distributions and the corresponding ROC curve, AUC = 0.5]

This is the worst situation. When AUC is approximately 0.5, the model has no discrimination capacity: it cannot distinguish between the positive class and the negative class at all.

[Figure: class distributions swapped relative to the ideal case, and the corresponding ROC curve, AUC ≈ 0]

When AUC is approximately 0, the model is actually swapping the classes: it predicts the negative class as positive and vice versa.
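A tiny sketch of that point: negating such a model's scores flips every pairwise ranking, so an AUC near 0 becomes an AUC near 1. The labels and scores below are illustrative.

```python
# An "inverted" model: every negative outscores every positive.
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 0, 1, 1, 1]
scores = [0.9, 0.8, 0.7, 0.2, 0.3, 0.1]  # a model that gets things backwards

print(roc_auc_score(y_true, scores))                 # 0.0
print(roc_auc_score(y_true, [-s for s in scores]))   # 1.0 after flipping scores
```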

Relation between Sensitivity, Specificity, FPR and Threshold.

Sensitivity and specificity trade off against each other: when we increase sensitivity, specificity decreases, and vice versa.

Sensitivity⬆️, Specificity⬇️ and Sensitivity⬇️, Specificity⬆️

When we decrease the threshold, more examples are classified as positive, which increases sensitivity and decreases specificity.

Similarly, when we increase the threshold, more examples are classified as negative, so we get higher specificity and lower sensitivity.

As we know, FPR = 1 - specificity, so when TPR increases, FPR also increases, and vice versa.

TPR⬆️, FPR⬆️ and TPR⬇️, FPR⬇️
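A small sketch of this trade-off, using synthetic scores of my own choosing: sweeping the threshold downward raises TPR and FPR together while specificity falls.

```python
# Sweep the decision threshold and watch TPR, specificity and FPR move.
import numpy as np

rng = np.random.default_rng(1)
y = np.array([0] * 500 + [1] * 500)
scores = np.concatenate([rng.normal(0.3, 0.15, 500),   # negatives score low
                         rng.normal(0.7, 0.15, 500)])  # positives score high

for threshold in [0.8, 0.5, 0.2]:
    y_pred = (scores >= threshold).astype(int)
    tp = ((y_pred == 1) & (y == 1)).sum()
    fn = ((y_pred == 0) & (y == 1)).sum()
    tn = ((y_pred == 0) & (y == 0)).sum()
    fp = ((y_pred == 1) & (y == 0)).sum()
    print(f"threshold={threshold}: TPR={tp/(tp+fn):.2f}, "
          f"specificity={tn/(tn+fp):.2f}, FPR={fp/(fp+tn):.2f}")
```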

How to use the AUC-ROC curve for a multi-class model?

For a multi-class model, we can plot N ROC curves for N classes using the One-vs-All methodology. For example, if you have three classes named X, Y and Z, you will have one ROC curve for X classified against Y and Z, another for Y classified against X and Z, and a third for Z classified against X and Y, as in the sketch below.
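Here is a minimal sketch of this One-vs-All approach with scikit-learn, reusing the hypothetical X, Y and Z class names; the dataset and model are illustrative stand-ins.

```python
# One-vs-All ROC curves for a three-class problem: binarize the labels,
# then treat each class column as its own binary problem.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.preprocessing import label_binarize

X, y = make_classification(n_samples=1500, n_classes=3, n_informative=6,
                           random_state=42)
model = LogisticRegression(max_iter=1000).fit(X, y)
probs = model.predict_proba(X)                # one column of scores per class
y_bin = label_binarize(y, classes=[0, 1, 2])  # one-hot labels for One-vs-All

for i, name in enumerate(["X", "Y", "Z"]):
    fpr, tpr, _ = roc_curve(y_bin[:, i], probs[:, i])
    auc = roc_auc_score(y_bin[:, i], probs[:, i])
    plt.plot(fpr, tpr, label=f"{name} vs rest (AUC = {auc:.3f})")

plt.plot([0, 1], [0, 1], "--")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()
```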

Thanks for reading.

I hope I've given you some understanding of what exactly the AUC-ROC curve is. If you liked this post, a tad of extra motivation will be helpful, so give this post some claps 👏. I am always open to your questions and suggestions. You can share this on Facebook, Twitter and LinkedIn so someone in need might stumble upon it.

You can reach me on LinkedIn: https://www.linkedin.com/in/manpreet-saluja-data-analyst-data-scientist-3823a8138/