Choosing the right threshold is always challenging. A Receiver Operating Characteristic (ROC) curve can help decide which threshold value is best.
The sensitivity, or true positive rate, of the model is shown on the y-axis, and the false positive rate, or 1 minus the specificity, is shown on the x-axis.
The ROC curve always starts at (0, 0), which corresponds to a threshold of 1. At this threshold the sensitivity is 0, so we won’t catch any good care cases. But since the false positive rate is 0 as well, we correctly label all the poor care cases.
The ROC curve always ends at (1, 1), which corresponds to a threshold of 0. So the threshold decreases as we move along the curve from (0, 0) to (1, 1).
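To make the two endpoints concrete, here is a minimal sketch that sweeps the threshold from 1 down to 0 and records one (false positive rate, true positive rate) pair per threshold. The labels, scores, and the helper name `roc_points` are made up for illustration; the sketch assumes a case is predicted positive when its score is at or above the threshold, and that the scores lie strictly between 0 and 1.

```python
def roc_points(labels, scores, thresholds):
    """Return a list of (threshold, fpr, tpr) tuples.

    labels: 1 for the positive class, 0 for the negative class.
    scores: predicted probability of the positive class.
    A case is predicted positive when score >= threshold (an assumption).
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((t, fp / negatives, tp / positives))
    return points


# Hypothetical toy data: 4 positive and 4 negative cases.
labels = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.2, 0.1]

# Thresholds listed from 1 down to 0, i.e. from (0, 0) towards (1, 1).
points = roc_points(labels, scores, [1.0, 0.75, 0.5, 0.3, 0.0])
for t, fpr, tpr in points:
    print(f"threshold={t:.2f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

On this toy data the first point is (0, 0) and the last is (1, 1), and both rates only grow as the threshold drops.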
Let’s take an approximate point (0.6, 0.98) on the curve. At this point we correctly label 98% of the good care cases, but with a false positive rate of 60%, meaning we mislabel 60% of the poor care cases.
So which threshold should we pick?
It depends on the trade-off we want to make between two costs:
- Cost of failing to detect positives
- Cost of raising false alarms
If we are more concerned with catching the good care cases (high sensitivity), pick a threshold that gives a very high true positive rate, and among those the one with the lowest false positive rate.
If we are more concerned with getting the poor care cases right (high specificity, that is, a low false positive rate), pick the threshold that maximises the true positive rate while keeping the false positive rate very low.
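One way to turn this trade-off into a rule is to attach a cost to each kind of mistake and pick the threshold with the lowest total cost. The sketch below is only an illustration under that assumption: the cost values, the toy data, and the names `error_counts` and `pick_threshold` are hypothetical, and a case is predicted positive when its score is at or above the threshold.

```python
def error_counts(labels, scores, threshold):
    """Count missed positives (FN) and false alarms (FP) at one threshold."""
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return fn, fp


def pick_threshold(labels, scores, thresholds, miss_cost, false_alarm_cost):
    """Return the candidate threshold with the lowest total cost.

    total cost = miss_cost * FN + false_alarm_cost * FP
    """
    def total_cost(t):
        fn, fp = error_counts(labels, scores, t)
        return miss_cost * fn + false_alarm_cost * fp

    return min(thresholds, key=total_cost)


# Hypothetical toy data: 4 positive and 4 negative cases.
labels = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.2, 0.1]
candidates = [1.0, 0.75, 0.5, 0.3, 0.0]

# Missing a positive is 5x worse than a false alarm: favours a low threshold.
sensitive = pick_threshold(labels, scores, candidates,
                           miss_cost=5, false_alarm_cost=1)

# A false alarm is 5x worse than a miss: favours a high threshold.
specific = pick_threshold(labels, scores, candidates,
                          miss_cost=1, false_alarm_cost=5)

print(sensitive, specific)
```

When misses are weighted more heavily the chosen threshold moves down, towards higher sensitivity; when false alarms are weighted more heavily it moves up, towards higher specificity.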
An ROC curve is also a great way to visualise and compare different classifiers for the same classification problem.