Choosing the Right Metric for Evaluating Machine Learning Models — Part 2

Alvira Swalin · USF-Data Science · May 2, 2018

Second part of the series, focusing on classification metrics

In the first blog, we discussed some important metrics used in regression, their pros and cons, and their use cases. This part will focus on the metrics commonly used in classification and why we should prefer some over others, depending on the context.

Definitions

Let’s first understand the basic terminology used in classification problems before going through the pros and cons of each method. You can skip this section if you are already familiar with the terminology; a short code sketch after the list shows how these quantities can be computed.

Source of Image: Wikipedia
  • Recall or Sensitivity or TPR (True Positive Rate): Number of items correctly identified as positive out of total true positives. TP / (TP + FN)
  • Specificity or TNR (True Negative Rate): Number of items correctly identified as negative out of total negatives. TN / (TN + FP)
  • Precision: Number of items correctly identified as positive out of total items identified as positive. TP / (TP + FP)
  • False Positive Rate or Type I Error: Number of items wrongly identified as positive out of total true negatives. FP / (FP + TN)
  • False Negative Rate or Type II Error: Number of items wrongly identified as negative out of total true positives. FN / (FN + TP)
Source of Image: Effect Size FAQs by Paul Ellis
  • Confusion Matrix: A table that summarizes the counts of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN), with actual classes along one axis and predicted classes along the other.
  • F1 Score: The harmonic mean of precision and recall, given by
    F1 = 2 * Precision * Recall / (Precision + Recall)
  • Accuracy: Percentage of total items classified correctly. (TP + TN) / (P + N), where P and N are the total numbers of positive and negative items.
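
To make these definitions concrete, here is a minimal Python sketch (not from the original article) that computes each quantity from a confusion matrix. The labels y_true and y_pred are made-up example data, and scikit-learn is assumed to be available.

# A minimal sketch showing how the quantities defined above can be
# computed for a binary classifier. The labels below are made-up data.
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             f1_score, precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # actual classes
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]   # predicted classes

# For binary labels {0, 1}, confusion_matrix returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

recall      = tp / (tp + fn)               # TPR / sensitivity
specificity = tn / (tn + fp)               # TNR
precision   = tp / (tp + fp)
fpr         = fp / (fp + tn)               # Type I error rate
fnr         = fn / (fn + tp)               # Type II error rate
f1          = 2 * precision * recall / (precision + recall)
accuracy    = (tp + tn) / (tp + tn + fp + fn)

# Cross-check the hand-written formulas against scikit-learn's scorers
assert abs(recall    - recall_score(y_true, y_pred))    < 1e-12
assert abs(precision - precision_score(y_true, y_pred)) < 1e-12
assert abs(f1        - f1_score(y_true, y_pred))        < 1e-12
assert abs(accuracy  - accuracy_score(y_true, y_pred))  < 1e-12

print(f"recall={recall:.2f}  precision={precision:.2f}  f1={f1:.2f}  accuracy={accuracy:.2f}")

Working the numbers out from TP, FP, FN, and TN and cross-checking them against scikit-learn's built-in scorers is a useful way to internalize the formulas before moving on to the aggregate metrics discussed next.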

ROC-AUC Score
