AUC ROC Curve

Gajendra
4 min read · Nov 23, 2022

One of the most common metrics used to analyze the performance of a classification model is AUC ROC. It evaluates how well binary and multi-class classifiers perform across different thresholds.

In general, we use AUC ROC when we have unbalanced data. The model's predictions are evaluated at various thresholds, and the best model is chosen based on its AUC score.

Threshold

Looking at the threshold in terms of binary classification is the simplest way to understand its importance.

Balanced Dataset

In a balanced dataset, where there are roughly equal numbers of samples in the positive and negative classes, a threshold of 0.5 is typically used.

Any predicted score below 0.5 is labeled negative, while anything at or above 0.5 is labeled positive.

By setting the threshold at 0.5 we give equal weight to the positive and negative classes.

Balanced Dataset
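To make this concrete, here is a minimal sketch (the predicted probabilities are made up for illustration) of how a 0.5 threshold turns scores into class labels:

```python
import numpy as np

# Hypothetical predicted probabilities for the positive class
probs = np.array([0.10, 0.35, 0.48, 0.52, 0.70, 0.95])

# For a balanced dataset we keep the default threshold of 0.5
threshold = 0.5
labels = (probs >= threshold).astype(int)  # 1 = positive, 0 = negative

print(labels)  # [0 0 0 1 1 1]
```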

Unbalanced Dataset

In an unbalanced dataset, the majority of samples belong to one class, either positive or negative.

Let’s assume we have more negative samples than positive.

If we set the threshold to 0.5, as we would for a balanced dataset, our model may perform poorly. Because it learns more from negative samples than from positive ones, it is likely to predict negative far more often than positive.

To overcome this issue, the threshold should be adjusted based on the class distribution. In this case, we should lower the threshold to increase the chance of predicting a positive outcome.

Unbalanced Dataset

Although the number of positive predictions will increase, the number of false positives will also increase.
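As a rough illustration (the labels and scores below are made up), lowering the threshold on an imbalanced sample recovers more true positives but also lets in more false positives:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical imbalanced data: 7 negatives, 3 positives
y_true  = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
y_score = np.array([0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.45, 0.60, 0.80])

for threshold in (0.5, 0.3):
    y_pred = (y_score >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(f"threshold={threshold}: TP={tp} FP={fp} FN={fn} TN={tn}")
```

With these particular numbers, moving the threshold from 0.5 to 0.3 raises the true positives from 2 to 3, but the false positives also rise from 1 to 4.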

ROC

The ROC, a.k.a. Receiver Operating Characteristic, curve plots TPR (True Positive Rate) against FPR (False Positive Rate), summarizing all of the confusion matrices that the different thresholds produce.

ROC (Google Images)
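A minimal sketch of how such a curve is usually produced with scikit-learn (the labels and scores are made up for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

# Hypothetical labels and predicted scores from a binary classifier
y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.55, 0.90])

# roc_curve sweeps the thresholds and returns FPR and TPR for each one
fpr, tpr, thresholds = roc_curve(y_true, y_score)

plt.plot(fpr, tpr, marker="o", label="model")
plt.plot([0, 1], [0, 1], linestyle="--", label="random guess")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()
```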

AUC

AUC, a.k.a. Area Under the Curve, measures how well a model is able to distinguish between classes. It provides a single measure of performance aggregated across all possible classification thresholds.

AUC (Google Images)
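In scikit-learn this number comes straight from roc_auc_score; a tiny sketch with made-up labels and scores:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical labels and scores; AUC collapses the whole ROC curve
# into one number: ~0.5 is no better than random, 1.0 is a perfect ranking
y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.55, 0.90])

print(roc_auc_score(y_true, y_score))
```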

AUC ROC Curve

To plot the ROC curve, we calculate TPR and FPR at different thresholds between 0 and 1.

Binary Classification

For binary classification, since we only have two classes, we can easily calculate TPR and FPR from the confusion matrix, as shown in the sketch below.

ROC — Single Model
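For reference, TPR = TP / (TP + FN) and FPR = FP / (FP + TN). A small sketch (with hypothetical predictions at one particular threshold) of reading them off a confusion matrix:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true labels and hard predictions at a single threshold
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_pred = np.array([0, 1, 0, 1, 1, 0, 1, 0])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)  # True Positive Rate (sensitivity / recall)
fpr = fp / (fp + tn)  # False Positive Rate (1 - specificity)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```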

Note: Multiple models can be trained with different hyperparameters, algorithms, etc. We can compute the AUC ROC metric for each one and then choose the best model based on its AUC score.

ROC — Multiple Model (Google Images)

For example, in the ROC plot above we can see that Gradient Boosting has the highest AUC and can be selected as the final model.
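A minimal sketch of that selection workflow (the candidate models and the synthetic, imbalanced dataset below are just assumptions for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced binary dataset
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
}

# Evaluate every candidate on the same held-out set and keep the highest AUC
for name, model in models.items():
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]
    print(name, round(roc_auc_score(y_test, scores), 3))
```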

Multi-class Classification

To calculate confusion matrices for multi-class models, we use the one-vs-rest methodology. This gives us as many ROC curves as there are classes.

ROC — Multiclass (Google Images)
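A sketch of the one-vs-rest computation, here using scikit-learn's iris dataset and a logistic regression purely as an example setup:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
probs = clf.predict_proba(X_test)  # one column of scores per class

# One-vs-rest: binarize the labels and compute one AUC (one ROC curve) per class
y_bin = label_binarize(y_test, classes=[0, 1, 2])
for k in range(3):
    print(f"class {k} vs rest: AUC = {roc_auc_score(y_bin[:, k], probs[:, k]):.3f}")
```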

While AUC ROC is not usually used for selecting a multi-class classification model, it is very useful for understanding which classes the model struggles to separate, and which features can be added or removed to improve the model's performance.

PRC

The PRC, a.k.a. Precision Recall Curve, shows the tradeoff between precision and recall at different thresholds. Just like ROC, it is often used when we have an unbalanced dataset, specifically when the positive class is in the minority.

A precision-recall curve, unlike the ROC curve, focuses solely on the classifier's performance on the positive (minority) class.

PRC (Google Images)
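A minimal sketch of plotting it with scikit-learn (the labels and scores are the same made-up imbalanced example used earlier):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import auc, precision_recall_curve

# Hypothetical imbalanced labels and classifier scores
y_true  = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
y_score = np.array([0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.45, 0.60, 0.80])

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
pr_auc = auc(recall, precision)  # area under the precision-recall curve

plt.plot(recall, precision, marker="o", label=f"PR AUC = {pr_auc:.2f}")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.legend()
plt.show()
```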

“A high AUC represents both high recall and high precision, where high precision relates to a low false positive rate, and high recall relates to a low false negative rate” — Scikit-Learn

I hope this article provides you with a basic understanding of the AUC ROC curve.

If you have any questions or if you find anything misrepresented please let me know.

Thanks!

