Adding ROC-AUC curves using Python

Snekhasuresh
Published in featurepreneur
Oct 8, 2022

You’ve built your machine learning model, so what’s next? You need to evaluate it and find out how good (or bad) it is, so you can decide whether to implement it. That’s where the AUC-ROC curve comes in.

The name might be a mouthful, but it just means that we are calculating the “Area Under the Curve” (AUC) of the “Receiver Operating Characteristic” (ROC) curve. The ROC curve plots the true positive rate against the false positive rate at every classification threshold. Confused? I feel you! I have been in your shoes. But don’t worry, I got you! The implementation is really easy in Python!

Train the Model:

In this example, I’m taking LogisticRegression and KNeighborsClassifier for classification.

from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# logistic regression
model1 = LogisticRegression()
# knn
model2 = KNeighborsClassifier(n_neighbors=4)

# fit model
model1.fit(X_train, y_train)
model2.fit(X_train, y_train)
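
The training code expects X_train, X_test, y_train, and y_test to exist already. If you don't have a dataset at hand, here is a minimal sketch that builds one. The synthetic data from make_classification, the 70/30 split, and the random_state values are my own assumptions for illustration, not part of the original example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Hypothetical binary dataset: 1000 samples, 20 features (an assumption)
X, y = make_classification(
    n_samples=1000, n_features=20, n_classes=2, random_state=42
)

# Hold out 30% of the rows for evaluation, keeping the class balance
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # (700, 20) (300, 20)
```

Any binary classification dataset works the same way, as long as the labels are 0 and 1 (or you pass the right pos_label later on).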

Probability Prediction:

Now that the models are trained, predict the class probabilities with each classifier. A classification model can either predict the class of a data point directly or predict its probability of belonging to each class. The latter gives us more control over the result: we can choose our own threshold for turning probabilities into class labels.

# predict probabilities
pred_prob1 = model1.predict_proba(X_test)
pred_prob2 = model2.predict_proba(X_test)
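
To make the point about thresholds concrete, here is a small sketch of turning probabilities into labels with a cutoff of your own choosing. The synthetic dataset and the 0.8 cutoff are assumptions picked for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical data standing in for the article's X_train / X_test
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# predict_proba returns one column per class; column 1 is the positive class
pos_prob = model.predict_proba(X_test)[:, 1]

# Same probabilities, two different decision rules
default_pred = (pos_prob >= 0.5).astype(int)  # the usual 0.5 cutoff
strict_pred = (pos_prob >= 0.8).astype(int)   # hypothetical stricter cutoff

print("positives at 0.5:", default_pred.sum())
print("positives at 0.8:", strict_pred.sum())
```

Raising the threshold can only keep or reduce the number of predicted positives. The ROC curve summarizes this trade-off across all thresholds at once.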

Getting the ROC-AUC Scores:

Sklearn has a very handy function, roc_curve(), which returns the FPR (false positive rate), TPR (true positive rate), and threshold values:

from sklearn.metrics import roc_curve

# roc curve for models
fpr1, tpr1, thresh1 = roc_curve(y_test, pred_prob1[:, 1], pos_label=1)
fpr2, tpr2, thresh2 = roc_curve(y_test, pred_prob2[:, 1], pos_label=1)

# roc curve for tpr = fpr
random_probs = [0 for i in range(len(y_test))]
p_fpr, p_tpr, _ = roc_curve(y_test, random_probs, pos_label=1)
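
If you are curious what roc_curve() is doing with those three arrays, this sketch recomputes the FPR and TPR by hand at each returned threshold and compares them with the function's output. The four-point toy dataset and the helper rates_at() are hypothetical, made up for this illustration.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Tiny hypothetical example: true labels and predicted scores
y_true = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])

fpr, tpr, thresholds = roc_curve(y_true, scores, pos_label=1)


def rates_at(threshold):
    """Return (FPR, TPR) when every score >= threshold is called positive."""
    pred = scores >= threshold
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    return fp / np.sum(y_true == 0), tp / np.sum(y_true == 1)


# Each (fpr, tpr) pair from roc_curve matches the manual count
for f, t, th in zip(fpr, tpr, thresholds):
    print(th, (f, t), rates_at(th))
```

Each point on the ROC curve is simply the pair (FPR, TPR) you would get by using one of those thresholds as your cutoff.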

The AUC score can be computed using the roc_auc_score() function from sklearn:

from sklearn.metrics import roc_auc_score 
# auc scores
auc_score1 = roc_auc_score(y_test, pred_prob1[:,1])
auc_score2 = roc_auc_score(y_test, pred_prob2[:,1])
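
As a sanity check on what the score means, here is a sketch that gets the same number three ways: with roc_auc_score(), by integrating the ROC curve with sklearn's auc() (trapezoidal rule), and by counting how often a positive example is scored above a negative one. The toy labels and scores are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Tiny hypothetical example: true labels and predicted scores
y_true = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])

# 1) Directly from labels and scores
area_direct = roc_auc_score(y_true, scores)

# 2) Area under the (fpr, tpr) curve via the trapezoidal rule
fpr, tpr, _ = roc_curve(y_true, scores)
area_trapezoid = auc(fpr, tpr)

# 3) Fraction of (positive, negative) pairs ranked in the right order
#    (this simple version assumes there are no tied scores)
pos = scores[y_true == 1]
neg = scores[y_true == 0]
area_pairs = np.mean(pos[:, None] > neg[None, :])

# each of the three values is 0.75 for this toy data
print(area_direct, area_trapezoid, area_pairs)
```

An AUC of 0.5 corresponds to the diagonal tpr = fpr line, and 1.0 means every positive is ranked above every negative.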

Plotting the curves in Matplotlib:

The curves can be visualized by using the following code:

# matplotlib
import matplotlib.pyplot as plt

# the 'seaborn' style was renamed 'seaborn-v0_8' in Matplotlib 3.6
style = 'seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn'
plt.style.use(style)

# plot roc curves
plt.plot(fpr1, tpr1, linestyle='--', color='orange', label='Logistic Regression')
plt.plot(fpr2, tpr2, linestyle='--', color='green', label='KNN')
plt.plot(p_fpr, p_tpr, linestyle='--', color='blue')
# title
plt.title('ROC curve')
# x label
plt.xlabel('False Positive Rate')
# y label
plt.ylabel('True Positive rate')

plt.legend(loc='best')
plt.savefig('ROC', dpi=300)
plt.show()
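
One optional variation on the plot is to put each model's AUC score in the legend, so the chart and the numbers sit together. This is only a sketch: the synthetic data, the max_iter setting, and the "No skill" label are my assumptions, and the models mirror the two used in this post.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical synthetic data standing in for X_train / X_test
X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=4),
}

fig, ax = plt.subplots()
for name, model in models.items():
    # Probability of the positive class for each test row
    prob = model.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, prob)
    score = roc_auc_score(y_test, prob)
    ax.plot(fpr, tpr, linestyle="--", label=f"{name} (AUC = {score:.3f})")

# Diagonal reference line where tpr = fpr
ax.plot([0, 1], [0, 1], linestyle="--", color="blue", label="No skill")
ax.set_title("ROC curve")
ax.set_xlabel("False Positive Rate")
ax.set_ylabel("True Positive Rate")
ax.legend(loc="best")
```

Call plt.show() or fig.savefig(...) afterwards, just like in the plotting code above, to display or save the figure.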

You get the following results:

There you go! You now know how to plot an AUC-ROC curve in Python!

Give yourself a pat on the back!
