Accuracy vs. AUC-ROC

Kaveti Naveenkumar
Published in Nerd For Tech · Jun 25, 2021

In this post I will talk about accuracy and the area under the ROC curve. Both metrics are used to validate a classification model on historical data for which the target variable is known.

Accuracy:

Accuracy is the simplest validation metric to compute and understand: it is the proportion of correct classifications. When the labels are roughly balanced (~50% positive and ~50% negative), accuracy is a useful way to validate the model, but with extremely imbalanced classes, say 98% negatives and 2% positives, it can lead to wrong conclusions.
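As a quick sketch of the imbalanced case (with made-up labels), a classifier that never predicts the positive class still reaches 98% accuracy:

```python
# Hypothetical example: 98 negatives, 2 positives, and a "model"
# that always predicts the majority (negative) class.
y_true = [0] * 98 + [1] * 2
y_pred = [0] * 100

# Accuracy = proportion of correct classifications
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)  # 0.98 -- looks great, yet not one positive was detected
```

A 0.98 accuracy here says nothing about the model's ability to find the 2% of positives, which is usually the class we care about.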

Confusion matrix of a binary classification model:

source: Joydwip’s blog

Accuracy = (TP + TN) / (TP + TN + FP + FN)
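Plugging some made-up confusion-matrix counts into the formula:

```python
# Toy confusion-matrix counts (invented for illustration)
TP, TN, FP, FN = 40, 50, 5, 5

accuracy = (TP + TN) / (TP + TN + FP + FN)
print(accuracy)  # 0.9
```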

Two major reasons why accuracy is not always useful:

  1. It is threshold-variant: the score depends heavily on the chosen classification threshold
  2. It is scale-variant: multiplying the predicted probabilities by a scalar changes the accuracy score
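The threshold dependence is easy to demonstrate with a small made-up example: the same probability scores produce different accuracies at different thresholds.

```python
# Made-up labels and probability scores
y_true = [0, 0, 1, 1]
scores = [0.2, 0.6, 0.4, 0.9]

def accuracy_at(threshold):
    """Accuracy when predicting positive for scores >= threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(t == p for t, p in zip(y_true, preds)) / len(y_true)

print(accuracy_at(0.5))  # 0.5
print(accuracy_at(0.3))  # 0.75
```

Moving the threshold from 0.5 to 0.3 changes accuracy from 0.5 to 0.75 without the model changing at all, which is exactly the threshold-variance problem.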

Area under ROC:

The area under the ROC curve is a very useful metric for validating a classification model because it is both threshold- and scale-invariant. The ROC curve plots TPR against FPR at different threshold values.

TPR (True Positive Rate) = TP / (TP + FN)
FPR (False Positive Rate) = FP / (FP + TN)
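Using the same made-up confusion-matrix counts as before, the two rates work out as:

```python
# Toy confusion-matrix counts (invented for illustration)
TP, TN, FP, FN = 40, 50, 5, 5

tpr = TP / (TP + FN)   # 40 / 45 ~ 0.889
fpr = FP / (FP + TN)   #  5 / 55 ~ 0.091
print(round(tpr, 3), round(fpr, 3))
```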

source: Machine Learning Mastery

The ROC curve plots FPR on the X-axis and TPR on the Y-axis, and each point on the plot corresponds to a threshold value.

  • At threshold 1, the model predicts the negative class for all data points, so FPR and TPR are both zero
  • At threshold 0, the model predicts the positive class for all data points, so FPR and TPR are both one
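The two extreme points above can be checked by sweeping the threshold over made-up scores (predicting positive when score >= threshold):

```python
# Made-up labels and probability scores
y_true = [0, 0, 1, 1]
scores = [0.2, 0.6, 0.4, 0.9]

def roc_point(threshold):
    """Return (FPR, TPR) when predicting positive for scores >= threshold."""
    tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold)
    fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)
    fn = sum(1 for t, s in zip(y_true, scores) if t == 1 and s < threshold)
    tn = sum(1 for t, s in zip(y_true, scores) if t == 0 and s < threshold)
    return fp / (fp + tn), tp / (tp + fn)

print(roc_point(0.0))  # (1.0, 1.0): everything predicted positive
print(roc_point(0.5))  # a point in between
print(roc_point(1.1))  # (0.0, 0.0): everything predicted negative
```

Each threshold yields one (FPR, TPR) point; connecting them traces out the ROC curve.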

The orange curve in the plot above is the ROC curve, and the area under this curve can be used to validate the classification model.

  • AUC-ROC is threshold-invariant, because no threshold value has to be chosen to compute it
  • AUC-ROC is scale-invariant, because multiplying the probability scores by a scalar does not change the metric (you can verify this yourself)
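One way to verify the scale invariance yourself: AUC equals the probability that a randomly chosen positive is scored higher than a randomly chosen negative, and multiplying all scores by a positive scalar preserves that ordering. A minimal sketch, using made-up scores:

```python
def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs where the
    positive outscores the negative (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 1]
scores = [0.2, 0.6, 0.4, 0.9]

print(auc(y_true, scores))                     # 0.75
print(auc(y_true, [0.1 * s for s in scores]))  # 0.75 -- unchanged after scaling
```

Rescaling the scores by 0.1 leaves every positive/negative ordering intact, so the AUC is identical, unlike accuracy at a fixed threshold.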

References:

  1. https://towardsdatascience.com/confusion-matrix-for-your-multi-class-machine-learning-model-ff9aa3bf7826
  2. https://machinelearningmastery.com/roc-curves-and-precision-recall-curves-for-classification-in-python/
  3. https://towardsdatascience.com/an-understandable-guide-to-roc-curves-and-auc-and-why-and-when-to-use-them-92020bc4c5c1
