Please Make This AI Less Accurate
Demystifying the term “accuracy” in Data Science and Artificial Intelligence
Accuracy is one of those words that everyone intuitively assumes they understand and that most people believe is always better when it is higher.
With the growing attention on Artificial Intelligence (AI) and the increasing awareness of lapses in the reliability or accuracy of its outputs, it is important for more people to understand that data products, such as AI, don’t follow the same rules of consistency or accuracy as other technologies.
The Confusion Matrix
To illustrate, let me introduce the concept of a “Confusion Matrix”. This will be very familiar to any Data Scientist who has built predictive models for classification purposes. It may be new to others, but I find that the concept, the methodology and the human/business interaction involved make a useful case study for understanding accuracy terminology in machine learning more broadly. It is a helpful visual tool for understanding both the nuance and the trade-offs in these terms.
When we speak about total accuracy we mean the number of correct predictions (the sum of the green boxes above) out of all predictions (the sum of the four boxes above). This is where you may hear claims like “Our pregnancy test is 99% accurate”. That figure describes the accuracy of all test predictions, both those that say the user is pregnant and those that say the user is not.
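To make the arithmetic concrete, here is a minimal sketch of that calculation. The four counts are hypothetical numbers chosen to land on 99%, and the function and variable names are my own, not taken from any particular library.

```python
def total_accuracy(true_pos: int, true_neg: int, false_pos: int, false_neg: int) -> float:
    """Correct predictions divided by all predictions."""
    correct = true_pos + true_neg                # the two "correct" boxes
    total = correct + false_pos + false_neg      # all four boxes
    return correct / total


# Hypothetical results from 1,000 tests (illustrative only).
accuracy = total_accuracy(true_pos=495, true_neg=495, false_pos=4, false_neg=6)
print(f"Total accuracy: {accuracy:.1%}")
```

Note that the single figure says nothing about how the 10 wrong answers are split between the two kinds of error.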
The nuance appears when you ask which of the two remaining red boxes that “inaccurate” percentage sits in.
For rare events, you could achieve a very high accuracy by predicting that the event never happens (no model required). However, for different models and use cases the cost or risk associated with inaccuracy is not equal or consistent.
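A quick sketch shows how this plays out. The data is made up for illustration: I assume 1,000 cases in which the event occurs only 10 times, and a "model" that simply predicts the event never happens.

```python
# Hypothetical data: 10 events (1) among 1,000 cases; the rest are non-events (0).
labels = [1] * 10 + [0] * 990

# The "no model required" approach: always predict that the event does not happen.
predictions = [0] * len(labels)

correct = sum(1 for pred, actual in zip(predictions, labels) if pred == actual)
baseline_accuracy = correct / len(labels)

# How many of the real events did this approach catch?
events_caught = sum(1 for pred, actual in zip(predictions, labels) if pred == 1 and actual == 1)

print(f"Accuracy: {baseline_accuracy:.1%}")
print(f"Events caught: {events_caught} of {sum(labels)}")
```

Under these assumptions the approach is right 99% of the time while catching none of the events it exists to find.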
Put plainly, a lower accuracy model may intentionally be that way because you want to reduce how often you mis-predict in one direction or another. In doing this you have to choose to compromise overall model accuracy.
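One common way this choice shows up in practice is the cut-off applied to a model's score. The sketch below is illustrative only: the scores and labels are invented, and the helper `evaluate` is a hypothetical name of my own. It compares a strict cut-off with a lenient one on the same ten cases.

```python
# Hypothetical (model score, actual outcome) pairs; 1 means the event happened.
scored_cases = [
    (0.95, 1), (0.85, 1), (0.70, 0), (0.60, 1), (0.55, 0),
    (0.45, 0), (0.40, 1), (0.30, 0), (0.20, 0), (0.10, 0),
]


def evaluate(cases, threshold):
    """Return (accuracy, false_negatives, false_positives) for a given cut-off."""
    correct = false_neg = false_pos = 0
    for score, actual in cases:
        predicted = 1 if score >= threshold else 0
        if predicted == actual:
            correct += 1
        elif actual == 1:
            false_neg += 1   # the event happened but we predicted it would not
        else:
            false_pos += 1   # we predicted the event but it did not happen
    return correct / len(cases), false_neg, false_pos


strict = evaluate(scored_cases, threshold=0.80)
lenient = evaluate(scored_cases, threshold=0.35)

for name, (acc, fn, fp) in [("strict", strict), ("lenient", lenient)]:
    print(f"{name}: accuracy={acc:.0%}, missed events={fn}, false alarms={fp}")
```

With these made-up numbers, the lenient cut-off is less accurate overall but misses no events, which is exactly the kind of compromise described above.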
Is it riskier to predict (or classify) that someone is pregnant and be wrong, or to predict that they are not pregnant and be wrong?