Evaluation Metrics Part 2

For Classification Models!!

Siladittya Manna
The Owl
4 min read · Jun 21, 2020


In Part 1 of the Evaluation Metrics series, we looked at some evaluation metrics which, besides giving an insight into model performance, also serve as the building blocks of several other metrics used in diagnostic testing of binary classification models. In this Part 2 of the series, we are going to look into some of those metrics and also see their Python implementation.

Several other metrics are used in diagnostic testing of binary classification models. An overview of these metrics can be obtained from the picture below.

(Picture: overview of diagnostic testing metrics. Image source)

Let us briefly discuss the other metrics in this picture.

Prevalence

Prevalence is the fraction of the total population that is labeled positive.
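As a minimal sketch, prevalence can be computed directly from the ground-truth labels; in terms of confusion-matrix counts it equals (TP + FN) / (TP + FP + TN + FN). The function name, the `positive_label` parameter and the example labels below are illustrative assumptions, not taken from the original post.

```python
def prevalence(y_true, positive_label=1):
    """Fraction of samples whose ground-truth label is the positive class."""
    if len(y_true) == 0:
        raise ValueError("y_true must not be empty")
    n_positive = sum(1 for label in y_true if label == positive_label)
    return n_positive / len(y_true)


# Hypothetical ground-truth labels: 4 positives out of 10 samples.
y_true = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
example_prevalence = prevalence(y_true)  # 4 / 10
```

Note that prevalence depends only on the labels, not on the model's predictions.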

Negative Predictive Value

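Negative Predictive Value or NPV is the proportion of negatively predicted samples which are originally labeled negative. In other words, it is the proportion of true negatives out of all the negatively predicted samples.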
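NPV is conventionally computed as TN / (TN + FN), that is, over the samples the model predicted negative. A small sketch follows; the function name and the counts are hypothetical, and returning NaN when there are no negative predictions is just one possible convention.

```python
def negative_predictive_value(tn, fn):
    """NPV = TN / (TN + FN): share of negative predictions that are correct."""
    predicted_negative = tn + fn
    if predicted_negative == 0:
        return float("nan")  # no negative predictions, so NPV is undefined
    return tn / predicted_negative


# Hypothetical confusion-matrix counts, chosen only for illustration.
tn, fn = 45, 10
npv = negative_predictive_value(tn, fn)  # 45 / 55
```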

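Positive and Negative Predictive Value can also be expressed in terms of prevalence, specificity and sensitivity as

PPV = (sensitivity × prevalence) / (sensitivity × prevalence + (1 − specificity) × (1 − prevalence))

NPV = (specificity × (1 − prevalence)) / (specificity × (1 − prevalence) + (1 − sensitivity) × prevalence)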
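The relationship follows from Bayes' theorem, and it can be checked numerically. The sketch below uses hypothetical function names and made-up counts to compute PPV and NPV from sensitivity, specificity and prevalence, and then compares them with the count-based values TP / (TP + FP) and TN / (TN + FN).

```python
def ppv_from_rates(sensitivity, specificity, prevalence):
    """PPV from sensitivity, specificity and prevalence (Bayes' theorem)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv_from_rates(sensitivity, specificity, prevalence):
    """NPV from sensitivity, specificity and prevalence (Bayes' theorem)."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Hypothetical confusion-matrix counts, chosen only for illustration.
tp, fn, fp, tn = 40, 10, 5, 45
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
prevalence = (tp + fn) / (tp + fn + fp + tn)

ppv = ppv_from_rates(sensitivity, specificity, prevalence)
npv = npv_from_rates(sensitivity, specificity, prevalence)

# Both agree with the count-based definitions.
assert abs(ppv - tp / (tp + fp)) < 1e-12
assert abs(npv - tn / (tn + fn)) < 1e-12
```

This form also shows why predictive values depend on the population: with the same sensitivity (0.8) and specificity (0.9) but a prevalence of 1%, `ppv_from_rates(0.8, 0.9, 0.01)` falls to roughly 0.075.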

False Discovery Rate

FDR

FDR is the proportion of positively predicted samples which are originally labeled negative. In other words, it is the proportion of false positives out of all the positively predicted samples.

False Omission Rate

FOR

FOR is the proportion of negatively predicted samples which are originally labeled positive. In other words, it is the proportion of false negatives out of all the negatively predicted samples.
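FDR and FOR are both computed over predicted classes: FDR = FP / (FP + TP) and FOR = FN / (FN + TN). The following sketch uses hypothetical function names and counts, and returns NaN when the denominator is empty, which is an assumed convention. It also checks that FDR and FOR are the complements of PPV and NPV.

```python
def false_discovery_rate(fp, tp):
    """FDR = FP / (FP + TP), computed over all positive predictions."""
    predicted_positive = fp + tp
    return fp / predicted_positive if predicted_positive else float("nan")


def false_omission_rate(fn, tn):
    """FOR = FN / (FN + TN), computed over all negative predictions."""
    predicted_negative = fn + tn
    return fn / predicted_negative if predicted_negative else float("nan")


# Hypothetical confusion-matrix counts, chosen only for illustration.
tp, fn, fp, tn = 40, 10, 5, 45

fdr = false_discovery_rate(fp, tp)      # 5 / 45
for_rate = false_omission_rate(fn, tn)  # 10 / 55

# FDR and FOR are the complements of PPV and NPV respectively.
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
assert abs(fdr - (1 - ppv)) < 1e-12
assert abs(for_rate - (1 - npv)) < 1e-12
```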

False Positive Rate

FPR, Fall-out

FPR is the proportion of negatively labeled samples which are incorrectly predicted positive.

False Negative Rate

FNR

FNR is the proportion of positively labeled samples which are incorrectly predicted negative.
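Unlike FDR and FOR, the two rates above are computed over the true classes: FPR = FP / (FP + TN) and FNR = FN / (FN + TP). As a rough sketch, they can be obtained from label lists by first counting the four confusion-matrix cells. The helper `confusion_counts` and the example labels are hypothetical, and the code assumes both classes occur in `y_true`.

```python
def confusion_counts(y_true, y_pred, positive_label=1):
    """Return (tp, fn, fp, tn) for binary labels."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive_label:
            if predicted == positive_label:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive_label:
                fp += 1
            else:
                tn += 1
    return tp, fn, fp, tn


# Hypothetical labels and predictions: 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)

fpr = fp / (fp + tn)  # negatives wrongly predicted positive: 1 / 6
fnr = fn / (fn + tp)  # positives wrongly predicted negative: 1 / 4
```

Equivalently, FPR = 1 − specificity and FNR = 1 − sensitivity.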

Positive Likelihood Ratio

LR+

LR+ is the ratio of the probability of a sample being predicted positive given that it is originally labeled positive to the probability of a sample being predicted positive given that it is originally labeled negative. In a real-life scenario, LR+ denotes the probability of a person who has a disease testing positive divided by the probability of a person who does not have the disease testing positive. The higher the value of LR+, the more strongly a positive test result points to a true positive. On the other hand, LR+ < 1 means that a positive result is more probable for a person without the disease than for a person with it, so a positive test result lowers the odds that the sample is truly positive.
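In terms of rates, LR+ = TPR / FPR = sensitivity / (1 − specificity). The sketch below compares two hypothetical classifiers, one with LR+ above 1 and one below. The function name, the counts and the handling of a zero FPR are assumptions made for illustration.

```python
def positive_likelihood_ratio(tp, fn, fp, tn):
    """LR+ = sensitivity / (1 - specificity) = TPR / FPR."""
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    if fpr == 0:
        # No false positives: the ratio is unbounded (undefined if TPR is also 0).
        return float("inf") if tpr > 0 else float("nan")
    return tpr / fpr


# Two hypothetical classifiers, each evaluated on 50 positives and 50 negatives.
lr_plus_good = positive_likelihood_ratio(tp=40, fn=10, fp=5, tn=45)   # 0.8 / 0.1
lr_plus_poor = positive_likelihood_ratio(tp=10, fn=40, fp=20, tn=30)  # 0.2 / 0.4
```

Here `lr_plus_good` is about 8, while `lr_plus_poor` is 0.5: the second classifier flags negative samples more readily than positive ones.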

Negative Likelihood Ratio

LR-

LR- is the ratio of the probability of a sample being predicted negative given that it is originally labeled positive to the probability of a sample being predicted negative given that it is originally labeled negative. In a real-life scenario, LR- denotes the probability of a person who has a disease testing negative divided by the probability of a person who does not have the disease testing negative.
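Written with rates, LR- = FNR / TNR = (1 − sensitivity) / specificity. A brief sketch with a hypothetical function name and made-up counts follows; the treatment of a zero TNR is an assumed convention.

```python
def negative_likelihood_ratio(tp, fn, fp, tn):
    """LR- = (1 - sensitivity) / specificity = FNR / TNR."""
    fnr = fn / (fn + tp)
    tnr = tn / (tn + fp)
    if tnr == 0:
        # No true negatives: the ratio is unbounded (undefined if FNR is also 0).
        return float("inf") if fnr > 0 else float("nan")
    return fnr / tnr


# Hypothetical confusion-matrix counts, chosen only for illustration.
lr_minus = negative_likelihood_ratio(tp=40, fn=10, fp=5, tn=45)  # 0.2 / 0.9
```

A value well below 1, as in this example, means that negative predictions are much more common among truly negative samples than among truly positive ones.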

Diagnostic Odds Ratio

DOR

DOR is a measure of the effectiveness of a diagnostic test (or model). It is defined as the ratio of the odds of the test (prediction) being positive if the sample (subject) is originally labeled positive to the odds of the test (prediction) being positive if the sample (subject) is originally labeled negative.
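The odds of a positive prediction are TP / FN for positively labeled samples and FP / TN for negatively labeled ones, so DOR = (TP × TN) / (FP × FN), which is also LR+ / LR-. The sketch below checks this equivalence on hypothetical counts. The function name and the zero-denominator handling are assumptions.

```python
def diagnostic_odds_ratio(tp, fn, fp, tn):
    """DOR = (TP / FN) / (FP / TN) = (TP * TN) / (FP * FN)."""
    denominator = fp * fn
    if denominator == 0:
        # No errors of at least one kind: unbounded, or undefined if TP * TN is 0.
        return float("inf") if tp * tn > 0 else float("nan")
    return (tp * tn) / denominator


# Hypothetical confusion-matrix counts, chosen only for illustration.
tp, fn, fp, tn = 40, 10, 5, 45
dor = diagnostic_odds_ratio(tp, fn, fp, tn)  # (40 * 45) / (5 * 10) = 36.0

# Equivalent form: the ratio of the two likelihood ratios.
lr_plus = (tp / (tp + fn)) / (fp / (fp + tn))
lr_minus = (fn / (fn + tp)) / (tn / (tn + fp))
assert abs(dor - lr_plus / lr_minus) < 1e-9
```

A DOR of 1 means the prediction carries no information about the label, and larger values indicate better discrimination.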

Check out Part 3 of this series on the ROC curve and AUC score, and Part 4 on how to measure uncertainty in evaluation metrics and how to bring all these metrics together in one place for multi-class classification problems.


Siladittya Manna
The Owl

Senior Research Fellow @ CVPR Unit, Indian Statistical Institute, Kolkata || Research Interest : Computer Vision, SSL, MIA. || https://sadimanna.github.io