Taxonomy of Uncertainty

Adam Zivner
May 7, 2017


Episode 31 of my favorite podcast, Rationally Speaking, brought to my attention a paper with an interesting name: “WARNING: Physics Envy May Be Hazardous To Your Wealth”.

The paper is quite interesting overall, even for a layperson (especially the first few chapters), but most relevant here is the 3rd chapter, which introduces a taxonomy of uncertainty. Look at the following questions and think about how you would answer them:

  • what’s the probability of winning on black in the game of roulette?
  • what’s the probability of getting 6 on an irregular dice?
  • what’s the probability of getting 6 on a dice when you don’t know what kind of dice it is (and the kind of dice changes from time to time)?
  • what is the probability that the casino will go out of business in the next 5 years?
  • what is the probability that there will be nuclear war in the next 5 years?

Looking at these questions, there seems to be a clear qualitative difference between successive questions. Even though all of them ask about probability, the first one can be answered with higher confidence than the second one, the third one with even less confidence, and so on.

Taxonomy of uncertainty

The following categories are taken from the 3rd chapter of the paper mentioned above. The names of the categories differ a bit from those in the paper to make them more memorable for me.

Level 1 — Complete certainty

There’s no uncertainty: we have models with laws that dictate what happens without any room for randomness. Classical and relativistic physics are built this way. Given the initial conditions, you can predict the final outcome with perfect certainty.

This category doesn’t deal with uncertainty, but it’s still good to include it in the taxonomy as an extreme end of the spectrum.

Level 2 — Risk

What’s the probability of winning on black in the game of roulette?

A (European) roulette wheel has 37 fields, numbered from 0 to 36. 18 numbers are black, 18 are red and one (number 0) is green. You can bet on either red or black; if the ball lands and stays on black, you get back double the amount you bet. So what is the probability of winning on black in the game of roulette?

  1. There’s a total of 37 possible outcomes, each of them equally probable
  2. 18 of them are black, i.e. the winning outcomes
  3. The probability of winning is the number of winning outcomes divided by the total number of outcomes, i.e. 18 / 37 ≈ 0.4865 (or 48.65%)

In this scenario, we know the exact model of a (fair) roulette wheel, so we can calculate the probability exactly and with great confidence. This is the realm of most of probability theory; in economics it is known as Knightian risk.
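The three steps above can be sketched in a few lines of Python. This is only an illustration of the "known model" case; the function name `probability_of_win` is made up for this example, and it assumes all outcomes are equally likely, as on a fair wheel.

```python
from fractions import Fraction

def probability_of_win(winning_outcomes: int, total_outcomes: int) -> Fraction:
    """Probability of winning when every outcome is equally likely."""
    return Fraction(winning_outcomes, total_outcomes)

# Roulette as described above: 18 black, 18 red and 1 green field (the 0).
p_black = probability_of_win(18, 37)

print(p_black)                    # exact value as a fraction
print(round(float(p_black), 4))   # the same value as a rounded decimal
```

Using `Fraction` keeps the result exact, which fits this level: nothing has to be estimated, only counted.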

Level 3 — Known model, unknown parameters

What’s the probability of getting 6 on the irregular dice?

The dice is highly irregular. Here we know that the dice has 6 possible outcomes, but the trouble is that these outcomes are clearly not equally probable and we don’t know the probability of each outcome. In essence, we know the structure of the model (a fixed set of possible outcomes with unchanging probabilities), but we don’t know the parameters (the probability of each outcome).

We might be able to use some complex math and physics to calculate the probability (note that this isn’t possible in a lot of otherwise similar situations), but there’s a simpler approach. Throw the dice a hundred times, record the relative frequency of each outcome, and you have a rough estimate of the probabilities. If you need better precision, throw some more. The law of large numbers “guarantees” that the observed frequencies will converge to the actual probabilities as the number of throws grows.
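To make the "throw it and count" approach concrete, here is a small simulation sketch. The values in `TRUE_PROBS` are invented purely for illustration (in reality they are exactly what we do not know), and `estimate_face_probability` is a hypothetical helper name. The simulated thrower only ever sees the outcomes, never the weights.

```python
import random

# Hypothetical "true" probabilities of faces 1..6 for an irregular dice.
# These numbers are an assumption for the simulation; the observer never sees them.
TRUE_PROBS = [0.10, 0.15, 0.20, 0.15, 0.10, 0.30]

def estimate_face_probability(face: int, throws: int, rng: random.Random) -> float:
    """Throw the dice `throws` times and return the relative frequency of `face`."""
    outcomes = rng.choices(range(1, 7), weights=TRUE_PROBS, k=throws)
    return outcomes.count(face) / throws

rng = random.Random(42)  # fixed seed so the run is repeatable
for n in (100, 1_000, 100_000):
    print(n, estimate_face_probability(6, n, rng))
```

With more throws the printed frequency should settle closer to the value hidden in `TRUE_PROBS`, which is the law of large numbers at work.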

Level 4 — Partially known model

What’s the probability of getting 6 on a dice when you don’t know what kind of dice it is (and the kind of dice changes from time to time)?

Here we don’t know the model exactly, but we have some idea of what the model can look like: there’s only a limited number of dice. If the intervals between dice switches are long enough, you may be able to infer which dice is currently in use (or narrow down the set of dice consistent with the observed data). Later you might start getting data inconsistent with your beliefs (the established model), which might mean either that the dice changed or that you were unlucky and believed in the wrong dice all along.
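One way to sketch the "infer which dice is in use" step is a simple Bayesian update over a handful of candidate dice. This is my own illustration rather than something taken from the paper: the candidate dice in `CANDIDATES`, the observed throws and the function name `posterior` are all assumptions, and the sketch assumes the dice did not change during the observed throws.

```python
# Hypothetical candidate dice: probabilities of faces 1..6 for each kind.
CANDIDATES = {
    "fair":       [1 / 6] * 6,
    "loaded_six": [0.1, 0.1, 0.1, 0.1, 0.1, 0.5],
    "loaded_one": [0.5, 0.1, 0.1, 0.1, 0.1, 0.1],
}

def posterior(throws, candidates):
    """Belief in each candidate dice after seeing `throws`, from a uniform prior."""
    prior = 1 / len(candidates)
    weights = {}
    for name, probs in candidates.items():
        likelihood = 1.0
        for face in throws:
            likelihood *= probs[face - 1]  # probability of this throw under this dice
        weights[name] = prior * likelihood
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

observed = [6, 6, 3, 6, 1, 6, 6, 2, 6, 5]  # made-up sequence of throws
beliefs = posterior(observed, CANDIDATES)

# Probability of a 6 on the next throw, averaged over the candidate dice.
p_six = sum(beliefs[name] * CANDIDATES[name][5] for name in CANDIDATES)
print(beliefs)
print(p_six)
```

If later throws start to look very unlikely under the currently favored dice, that is the signal described above: either the dice was switched, or the earlier belief was wrong. Restarting the update on a recent window of throws is one simple way to react.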

Another type of scenario falling into this category involves unknown complex / non-linear models for which we believe we can create simplified approximations. These simplified models can be quite useful, but they need to be taken with a grain of salt, because they can fail miserably when the approximation diverges significantly from the complex underlying model. Examples are various economic theories: supposedly simplified models of unknown complex underlying mechanisms, which can often predict quite well but fail in some special circumstances (e.g. the failure to anticipate the 2008 economic crisis). Black swan events fall into this category.

To summarize, uncertainty with a partially known model is a situation where we have some model but are aware that it doesn’t fit the real world (the underlying data-generating mechanism) perfectly.

Level 5 — Unknown model

What is the probability that there will be nuclear war in the next 5 years?

Here we again have an unknown complex / non-linear model, but we also don’t have any useful approximation, and we mostly believe that a useful approximation either doesn’t exist or is out of our reach. Collecting more data and thinking harder won’t significantly help here.

You can of course use some social science models and theories to produce such predictions, but we have to admit that they are mostly useless and simply out of touch with reality.

The difference between a partially known model and an unknown model is quite subjective. In both cases we have some model(s) and subjectively rate our confidence in their ability to approximate the underlying model: either they are still useful (partially known model) or they are more of an illusion, with no confidence in their predictions (unknown model).

Applications

So how is this useful?

The original paper introduces this taxonomy to classify scientific disciplines and shows that they differ greatly: physics is mostly on Level 1 and Level 2, economics on Level 3 and 4, other social sciences on Level 4 and 5. The “physics envy” of the title refers to the observation that economists and other social scientists “envy” the confidence of physical models and their predictions. There are (sometimes successful) efforts to move these sciences and individual theories up the hierarchy, in a sense to elevate them to the certainty of physics (and the prestige that comes with it).

But this taxonomy can also be useful for sharper thinking about probability and confidence. When dealing with uncertainty in both day-to-day and intellectual life, it’s useful to place the model you’re using to estimate a probability into this taxonomy. This will give you a rough idea of how much confidence your prediction deserves.
