TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


Real world occurrences versus model confidence scores. Image created with Dall·E by the author.


You Think 80% Means 80%? Why Prediction Probabilities Need a Second Look

Understand the gap between predicted probabilities and real-world outcomes

8 min read · Jan 14, 2025


How reliable are the probabilities predicted by a machine learning model? What does a predicted probability of 80% mean? Is it the same as an 80% chance of an event occurring? In this beginner-friendly post, you’ll learn the basics of prediction probabilities and calibration, and how to interpret these numbers in a practical context. I will show with a demo how you can evaluate and improve these probabilities for better decision-making.

What do prediction probabilities represent?

Instead of calling model.predict(data), which gives you a 0 or 1 prediction for a binary classification problem, you might have used model.predict_proba(data). This gives you probabilities instead of zeroes and ones. In many data science cases this is useful, because it gives you more insight than a hard label. But what do these probabilities actually mean?
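To make the difference concrete, here is a minimal sketch that assumes scikit-learn. The synthetic dataset, the choice of LogisticRegression, and the variable names are illustrative assumptions, not the author's demo.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data (an assumption for illustration only)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
model = LogisticRegression().fit(X, y)

data = X[:5]

# Hard 0/1 predictions
labels = model.predict(data)

# One row per instance, one column per class: [P(class 0), P(class 1)]
proba = model.predict_proba(data)

# The second column is the probability of the positive class
positive_proba = proba[:, 1]

print(labels)
print(positive_proba.round(2))
```

For this binary logistic regression, the hard labels correspond to thresholding the positive-class probability at 0.5, so the probabilities carry strictly more information than the labels.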

A predicted probability of 0.8 means that the model is 80% confident that an instance belongs to the positive class. Let’s repeat that: the model is 80% confident that an instance belongs to the positive class. So it doesn’t automatically mean that there is an 80% real-world likelihood of the event occurring.
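One way to see the gap between confidence and real-world frequency is to group predictions by their predicted probability and count how often the positive class actually occurred in each group. The sketch below uses only the Python standard library. The helper name reliability_table and the toy numbers are made up for illustration and do not come from the article.

```python
def reliability_table(probs, outcomes, n_bins=5):
    """Bin predictions by predicted probability and compare the mean
    predicted probability with the observed positive rate in each bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))

    table = []
    for members in bins:
        if not members:
            continue  # skip empty bins
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        table.append((round(mean_pred, 2), round(observed, 2), len(members)))
    return table


# Made-up toy data: ten predictions of 0.8 (six were positive)
# and ten predictions of 0.2 (two were positive)
probs = [0.8] * 10 + [0.2] * 10
outcomes = [1] * 6 + [0] * 4 + [1] * 2 + [0] * 8

table = reliability_table(probs, outcomes)
for mean_pred, observed, count in table:
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

In this toy data the 0.2 group matches its observed rate, while the 0.8 group turned out positive only 60% of the time. That mismatch is what a calibration check is meant to surface.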



Published in TDS Archive



Written by Hennie de Harder

📈 Data Scientist & ML Engineer 💡 Simplifying complex topics ✨ Sharing fun side projects 💻 Working at IKEA and BigData Republic 🐈 Love math, cats, & running
