Confusion Matrix — Explained via Rain & an Umbrella

Vinay Mimani
Published in Analytics Vidhya
4 min read · Oct 6, 2019
Photo by Osman Rana on Unsplash

Ron’s job is to keep people informed about inclement weather, so that they can plan their days better. Every day, Ron wakes up, observes the surroundings, and predicts whether it will rain.

Before people leave for work, they turn on the radio to listen to Ron’s prediction. They decide whether to carry an umbrella based on what he predicts.

Ron meets his nemesis

One day, Ron was challenged by a guy named Brick

Brick said: “Ron, I challenge you to a duel. Whoever is better at predicting rain will win.”

Ron accepted, obviously

Taking stock of reality

In the real world, only 2 outcomes are possible (think binary classification problems):

  • It rained
  • It didn’t rain

Let’s visualise this as a 2x2 grid, where we map what happens in reality against what Ron predicts

There are a total of 4 scenarios:

  • Ron said “it will rain”, and it rained — HIT (True Positive)
  • Ron said “it will not rain”, and it rained — MISS (False Negative)
  • Ron said “it will rain”, and it didn’t rain — FALSE ALARM (False Positive)
  • Ron said “it will not rain”, and it didn’t rain — CORRECT REJECTION (True Negative)
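The four scenarios above can be sketched in a few lines of Python. This is only an illustration: the function name `outcome` and the sample `days` list are made up for this example and do not come from Ron's or Brick's actual records.

```python
from collections import Counter


def outcome(predicted_rain: bool, it_rained: bool) -> str:
    """Label one day with one of the four scenario names used above."""
    if predicted_rain and it_rained:
        return "HIT"                # True Positive
    if not predicted_rain and it_rained:
        return "MISS"               # False Negative
    if predicted_rain and not it_rained:
        return "FALSE ALARM"        # False Positive
    return "CORRECT REJECTION"      # True Negative


# Hypothetical (forecast, reality) pairs, invented purely for illustration.
days = [(True, True), (False, True), (True, False), (False, False), (True, True)]

# Tally how many days fall into each cell of the 2x2 grid.
counts = Counter(outcome(forecast, reality) for forecast, reality in days)
print(counts)
```

Counting the labels this way fills in the four cells of the grid, which is all a confusion matrix is.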

Duel time

To find the winner, people started asking the question:

Of all the days that it rained, how many did Ron get right?

Hit Rate = ( No. of times Ron said it would rain and it rained / No. of times it really rained )

Based on the score card below, Ron’s Hit Rate = 2/5 = 0.4, which means that of the 5 days that it rained, Ron got only 2 right

Score card for Ron
Score card for Brick

Brick’s Hit Rate = 2/5 = 0.4

Hit Rate is also known as Recall, Sensitivity, or (in hypothesis testing) Statistical Power

Both Ron and Brick ended up with the same Hit Rate. Interesting!
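As a quick check of the arithmetic, here is a minimal sketch of the Hit Rate calculation. The function name `hit_rate` is an assumption for this example. The counts follow from the scores quoted above: it rained on 5 days, and each forecaster called 2 of them correctly, which leaves 3 misses each.

```python
def hit_rate(hits: int, misses: int) -> float:
    """Hit Rate (Recall): share of rainy days that were forecast as rainy."""
    rainy_days = hits + misses
    if rainy_days == 0:
        raise ValueError("Hit Rate is undefined if it never rained")
    return hits / rainy_days


# 5 rainy days, 2 called correctly by each forecaster, so 3 misses each.
ron_hit_rate = hit_rate(hits=2, misses=3)
brick_hit_rate = hit_rate(hits=2, misses=3)

print(ron_hit_rate, brick_hit_rate)  # 0.4 0.4
```

Note that the denominator only counts days on which it actually rained, so false alarms have no effect on this score.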

Tie breaker

Ron and Brick’s equal Hit Rate became the talk of the town. Then one of the townsfolk had a smart moment.

She said: “Yesterday, Ron predicted that it would rain, but it didn’t! I had to carry an umbrella unnecessarily. Shouldn’t we also check how many times they raised a false alarm and made us carry umbrellas for no reason?”

It made sense, because no one likes carrying an umbrella unnecessarily

So now, they all started asking one more question:

Of all the days that Ron said it would rain, on how many did it actually rain?

Precision = ( No. of times it rained when Ron predicted rain / No. of times Ron predicted it would rain )

Score Card for Ron

Ron’s Precision = 2/5 = 0.4

Score Card for Brick

Brick’s Precision = 2/3 ≈ 0.67
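The same kind of sketch works for Precision. The function name `precision` is an assumption for this example. The counts follow from the scores above: Ron forecast rain 5 times and was right on 2 of them (3 false alarms), while Brick forecast rain 3 times and was right on 2 (1 false alarm).

```python
def precision(hits: int, false_alarms: int) -> float:
    """Precision: share of rain forecasts on which it actually rained."""
    rain_forecasts = hits + false_alarms
    if rain_forecasts == 0:
        raise ValueError("Precision is undefined if rain was never forecast")
    return hits / rain_forecasts


# Ron: 5 rain forecasts, 2 correct. Brick: 3 rain forecasts, 2 correct.
ron_precision = precision(hits=2, false_alarms=3)
brick_precision = precision(hits=2, false_alarms=1)

print(round(ron_precision, 2), round(brick_precision, 2))  # 0.4 0.67
```

Here the denominator only counts days on which rain was forecast, so misses have no effect on this score. That is why two forecasters with the same Hit Rate can still differ on Precision.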

So, who’s better at predicting rain?

It depends…

Imagine the following scenarios

  • Ron keeps saying it won’t rain, but it rains. As a result, you get wet more often
  • Ron keeps saying it will rain, but it doesn’t rain. As a result, you often end up carrying an umbrella unnecessarily (you won’t get wet though)

Which scenario would you prefer?

Scenario 1:

You are ok with getting wet more often

You are ok with Ron Missing more often

You don’t care so much about Hit Rate

You care about Precision

Scenario 2:

You are ok with frequently carrying an umbrella for no reason

You are ok with Ron raising False Alarms more often

You don’t care so much about Precision

You care about Hit Rate (Recall)

Of course, the above is an oversimplification. There are instances where you would care about both Precision and Recall. That, however, is outside the scope of this article.

Read this excellent article which inspired this verbose post, and thanks for reading! Leaving feedback is strongly encouraged…

Confused by The Confusion Matrix
