Fooled by conditioning

Kristian Wichmann
5 min read · Sep 5, 2018


It’s a well-known fact that our intuition about probability is often dead wrong! Interpreting the results of a binary classifier, such as a disease screening, is one such example.

This is the first story in a series about conditioning, Bayes’ formula and the Bayesian interpretation of probability.

A screening example

A screening for a disease gives an answer to the question “Does the patient have the disease?” A “yes” is called a positive and a “no” a negative.

Consider now a specific screening for a disease which occurs in 1% of the population. At first glance, it looks like a rather good test:

  • For patients who actually have the disease, the screening gives a correct result 80% of the time.
  • For patients who actually do not have the disease, the screening gives a correct result 95% of the time.

Gut feeling

Now, if you are screened and get a positive result, then what is the probability of you actually having the disease? What is your gut feeling?

If you’re anything like the author of this article, your immediate answer will be 80%. Smarter people than me have been similarly fooled. Fooled, because indeed it turns out that the true answer is much lower than most expect!

Some terminology

It turns out that the given 1% occurrence rate of the disease is actually rather crucial for answering the question! This is known as the prevalence of the disease. The other two probabilities in the problem have technical terms as well:

  • The probability of a correct screening for a patient who actually has the disease is known as the sensitivity. In our example the sensitivity is 80%. (Sensitivity is also referred to as recall.)
  • The probability of a correct screening for a patient who actually does not have the disease is known as the specificity. In our example the specificity is 95%.

Note that both of these are actually conditional probabilities. Hence the title of this story.
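To spell the conditioning out, write D for “has the disease” and +/− for the test result; the three numbers of the problem then read (my notation, not the article’s):

```latex
\begin{aligned}
P(D)              &= 0.01 \quad\text{(prevalence)} \\
P(+ \mid D)       &= 0.80 \quad\text{(sensitivity)} \\
P(- \mid \lnot D) &= 0.95 \quad\text{(specificity)}
\end{aligned}
```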

A representative sample

In order to clearly see why the gut feeling above is wrong, let us consider a representative sample of 10,000 people. Since only 1% of the population actually has the disease, this means that the 10,000 can be divided into two groups:

  • 100 people who actually have the disease.
  • 9,900 people who actually do not have the disease.
Only one in a hundred people actually has the disease.
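As a quick sanity check, the split is simple to compute; here is a minimal Python sketch (the variable names are my own):

```python
# A representative sample of 10,000 people, split by the 1% prevalence.
population = 10_000
prevalence = 0.01

diseased = round(population * prevalence)  # 100 people with the disease
healthy = population - diseased            # 9,900 people without it

print(diseased, healthy)  # -> 100 9900
```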

False negatives and false positives

Since the sensitivity of the test is 80%, when a person who actually has the disease is screened, the result is incorrect, i.e. negative, 20% of the time. This is known as a false negative.

Similarly, since the specificity is 95%, when a person who actually does not have the disease is screened, the result is incorrect, i.e. positive, 5% of the time. This is a false positive.

A false positive is also known as a type I error, and a false negative as a type II error.

  • The rate of type I errors, i.e. false positives (5% in our case, which is 1 − specificity), is denoted by the Greek letter alpha.
  • The rate of type II errors, i.e. false negatives (20% in our case, which is 1 − sensitivity), is denoted by the Greek letter beta.

The true positives, false positives, true negatives, and false negatives are often summed up in a confusion matrix.

                 Test positive     Test negative
Has disease      true positive     false negative
No disease       false positive    true negative

Confusion matrix

What does this mean for our sample?

Back in our sample, there are 100 people who actually have the disease. Of these, 80 correctly test positive, while the remaining 20 are false negatives (or type II errors).

Of the 9,900 people who actually do not have the disease, 95%, i.e. 9,405 individuals, correctly test negative. The remaining 5%, i.e. 495 individuals, are false positives (or type I errors).

                 Test positive     Test negative
Has disease            80                20
No disease            495             9,405

Confusion matrix for the sample
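Here is the same bookkeeping as a minimal Python sketch (variable names are my own):

```python
# Filling in the confusion matrix for the 10,000-person sample.
diseased, healthy = 100, 9_900
sensitivity = 0.80   # P(positive result | has disease)
specificity = 0.95   # P(negative result | no disease)

true_positives = round(diseased * sensitivity)   # 80
false_negatives = diseased - true_positives      # 20
true_negatives = round(healthy * specificity)    # 9405
false_positives = healthy - true_negatives       # 495

print(true_positives, false_negatives, true_negatives, false_positives)
# -> 80 20 9405 495
```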

So because the second group is so much larger due to the low prevalence, the false positives vastly outnumber the true positives. Herein lies the essential problem.

In the confusion matrix above, the true-positive cell contains way fewer individuals than the false-positive cell!

Finally, an answer

Now we are ready to answer the question: Given that the screening is positive, what is the probability of actually having the disease?

First, how many of the 10,000 test positive? The true positives number 80, while there are 495 false positives. So 80+495=575 in total.

This means that the probability of actually having the disease is 80/575, or about 14%. Much lower than 80%!
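The 14% figure is what is known as the positive predictive value of the test (in machine learning terms, its precision). A minimal Python sketch verifying the arithmetic both ways (variable names are my own):

```python
# Probability of disease given a positive test, from the sample counts.
true_positives, false_positives = 80, 495
ppv = true_positives / (true_positives + false_positives)
print(f"{ppv:.1%}")  # -> 13.9%

# The same number straight from the three probabilities. (This is Bayes'
# formula in disguise, as the next installment will show.)
prevalence, sensitivity, specificity = 0.01, 0.80, 0.95
ppv = sensitivity * prevalence / (
    sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
)
print(f"{ppv:.1%}")  # -> 13.9%
```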

Conclusion

The prevalence of a disease is usually low, which means that the false positives very often far outnumber the true positives. As a consequence, a positive screening result has surprisingly little predictive power with regard to whether the patient actually has the disease.
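To see how strongly the answer depends on prevalence, here is a small sketch that keeps the test quality fixed and varies only the prevalence (illustrative, not from the original article):

```python
# Same test quality, very different answers: only the prevalence varies.
sensitivity, specificity = 0.80, 0.95

for prevalence in (0.001, 0.01, 0.05, 0.20, 0.50):
    ppv = sensitivity * prevalence / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    )
    print(f"prevalence {prevalence:6.1%} -> P(disease | positive) = {ppv:.1%}")
# -> roughly 1.6%, 13.9%, 45.7%, 80.0%, 94.1%
```

With this particular test, the gut-feeling answer of 80% would only be correct if one person in five actually had the disease.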

Teaser for next installment: Bayes’ theorem.

Next time, we’ll take a deeper look at conditioning, and see how it leads to Bayes’ theorem.

