Imagine if we were writing a program to detect cancer from an MRI image. If we were detecting cancer, we’d rather have false positives than false negatives. False negatives would be the worse possible case — that’s when the program told someone they definitely didn’t have cancer but they actually did.
We need to look more closely at the numbers than just the overall accuracy. To judge how good a classification system really is, we need to look closely at how it failed, not just the percentage of the time that it failed.