Jessica Hullman and Matthew Kay (the MU Collective).
TLDR: The first in a series summarizing what we know about visualizing uncertainty in data. This post covers what we mean by “uncertainty” in visualization, and considers a few subtle, yet non-optimal, approaches. Later in this series we’ll cover a number of static and dynamic techniques and what empirical studies of these techniques have to say about how well they work.
Think about the last few visualizations you’ve come across, whether online, as part of your work, or in some other everyday setting. Did they visualize uncertainty?
If you said no, you’re not alone. While uncertainty representation is common in certain contexts (like statistical graphics or the insurance industry, to name two), many of the visualizations we see on public-facing websites or in reports don’t present uncertainty visually, and may not present it at all.
It’s easy to get overwhelmed by statistical details when trying to quantify uncertainty, or by technical details in reading research on deployments of uncertainty visualization techniques. Here we begin what will be a series of posts touring the research on uncertainty visualization. First, though, let’s talk about uncertainty itself. If you’re comfortable with probability distributions and varieties of estimation error, skip ahead to the next section.
What We Mean When We Say “Uncertainty”
Uncertainty can mean many things. When we talk about visualizing uncertainty, though, we usually mean visualizing information about different values the data could plausibly be. One place where uncertainty comes into play is when we have some target quantity in the world that we want to know (a parameter in stats speak), like how many jobs were created last month or how many people support a given political candidate. However, often the data we have are limited, and cannot exactly answer the questions we have. For example, we can’t possibly survey every single person in the country to ask if they support a political candidate. Instead, we have to ask a smaller group of people what they think (a sample of people) and extrapolate from there to the larger population. This introduces uncertainty.
Let’s make this more concrete through an example. Take monthly jobs figures, as shown in this visualization based on data from the Bureau of Labor Statistics (BLS):
At first glance, we might interpret the values in the chart as the outcome of asking every person in the country if they are unemployed, or perhaps some proxy we imagine would capture that information, like the proportion of workers in the U.S. on unemployment insurance. In reality, the government relies on the Current Population Survey (CPS), which is a survey of 60,000 eligible households — roughly 110,000 people — that are selected to be representative of the U.S. population. Employment information from this sample is weighted by demographics to give an estimate of the unemployment rate in the U.S. population at large.
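To see why relying on a sample introduces uncertainty, here is a minimal simulation sketch. It assumes a hypothetical true unemployment rate of 3.6% and a simple random sample of 5,000 people per survey; both numbers are ours, chosen for illustration. The real CPS uses a much larger sample and demographic weighting, which this sketch ignores.

```python
import random
import statistics

def simulate_survey_estimates(true_rate, sample_size, n_surveys, seed=0):
    """Return the estimated rate (in percent) from each simulated survey."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_surveys):
        # Each respondent is unemployed with probability true_rate.
        unemployed = sum(rng.random() < true_rate for _ in range(sample_size))
        estimates.append(100 * unemployed / sample_size)
    return estimates

# Hypothetical values: a true rate of 3.6% and 5,000 respondents per survey.
estimates = simulate_survey_estimates(true_rate=0.036, sample_size=5_000, n_surveys=200)

print(f"lowest estimate:  {min(estimates):.2f}%")
print(f"highest estimate: {max(estimates):.2f}%")
print(f"mean estimate:    {statistics.mean(estimates):.2f}%")
```

Every simulated survey asks the same question of the same population, yet the estimates differ from one survey to the next. In practice we only get to run the survey once, so we don’t know where our single estimate falls within that spread.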
While we can’t tell exactly what the true unemployment rate is, we can build statistical models to quantify our uncertainty around what we think it is. One such model (which is not necessarily representative of how the BLS models unemployment) could consider every possible value the true unemployment rate could be, then for each possible value, ask how likely it is that we would have gotten a sample that looks like the one we got. The model might even incorporate additional information, like how fast unemployment tends to change over time or whether or not there are seasonal effects (like summer jobs)¹. This process, depending on your school of statistical thought², helps us infer a probability distribution for the unemployment rate at any given point in time. For example, we can show the uncertainty in what unemployment was in April 2019 as a probability distribution, according to our statistical model:
A probability distribution describes a set of possible values for the rate that are consistent to varying degrees with the data we saw and what our model assumes about how the world works. Values that are more consistent are assigned a higher probability. Here is the probability distribution for the unemployment rate in April 2019 in more detail:
Regions with greater area under the curve have a higher probability. With a probability distribution, we can pick any two points and ask, what is the chance that the unemployment rate was between those two values in April? For example, let’s pick the points 3.5% and 3.8%: 95% of the area under the curve is between 3.5% and 3.8%, so there was a 95% chance the unemployment rate was between about 3.5% and 3.8% in April.
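The idea of considering each possible value and asking how likely our sample would be under it can be sketched as a grid approximation. This is not the structural time series model behind our figures. It is a much simpler stand-in that assumes a hypothetical unweighted survey of 60,000 people, 2,190 of whom are unemployed, a binomial likelihood, and a flat prior over candidate rates from 0.01% to 9.99%.

```python
import math

def grid_posterior(unemployed, sample_size):
    """Return {rate in hundredths of a percent: probability}, assuming a flat prior."""
    candidates = range(1, 1000)  # 0.01% up to 9.99%, in steps of 0.01%
    log_likelihoods = []
    for i in candidates:
        p = i / 10_000
        # Binomial log-likelihood of the sample if the true rate were p.
        # The binomial coefficient is the same for every p, so it is omitted.
        log_likelihoods.append(
            unemployed * math.log(p) + (sample_size - unemployed) * math.log(1 - p)
        )
    peak = max(log_likelihoods)
    # Subtract the peak before exponentiating to avoid numerical underflow everywhere.
    weights = [math.exp(ll - peak) for ll in log_likelihoods]
    total = sum(weights)
    return {i: w / total for i, w in zip(candidates, weights)}

# Hypothetical sample: 2,190 unemployed out of 60,000 respondents.
posterior = grid_posterior(unemployed=2_190, sample_size=60_000)

# "Area under the curve" between 3.5% and 3.8% is the summed probability there.
prob_between = sum(prob for i, prob in posterior.items() if 350 <= i <= 380)
most_plausible = max(posterior, key=posterior.get)

print(f"most plausible rate: {most_plausible / 100:.2f}%")
print(f"P(3.5% <= rate <= 3.8%) = {prob_between:.3f}")
```

With these made-up inputs, most of the probability lands between 3.5% and 3.8%, which is the same kind of statement we read off the curve above: pick two points, add up the probability between them.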
A cautionary note: For the sake of explaining uncertainty, we created a statistical model that allows us to point to intervals, like 3.5% to 3.8%, and say that there is a 95% chance that the interval contains the true unemployment value. Unfortunately, many intervals that we see, including from the BLS, are not so easily interpreted: the 95% instead refers to a property of the process used to construct the interval, not the chance that it contains the true value. Sound abstract and confusing? It is. We’ll talk more about how the way in which uncertainty is calculated changes how we interpret the meaning of intervals or probability distributions in the third post in this series.
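To make “a property of the process” a little less abstract, here is a small simulation sketch. It assumes a hypothetical true rate of 3.6%, simple random samples of 2,000 people, and the textbook normal-approximation interval for a proportion; none of these choices reflect how the BLS actually constructs its intervals.

```python
import math
import random

def coverage_of_95_intervals(true_rate, sample_size, n_surveys, seed=1):
    """Fraction of simulated surveys whose 95% interval contains true_rate."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_surveys):
        unemployed = sum(rng.random() < true_rate for _ in range(sample_size))
        p_hat = unemployed / sample_size
        # Normal-approximation ("Wald") interval for a proportion.
        half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / sample_size)
        if p_hat - half_width <= true_rate <= p_hat + half_width:
            covered += 1
    return covered / n_surveys

# Hypothetical values: true rate of 3.6%, 2,000 respondents, 1,000 repeated surveys.
coverage = coverage_of_95_intervals(true_rate=0.036, sample_size=2_000, n_surveys=1_000)
print(f"share of intervals containing the true rate: {coverage:.1%}")
```

The 95% describes how often the procedure captures the true value across many repeated surveys. Any single interval either contains the true rate or it doesn’t, and from one survey alone we can’t tell which.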
The uncertainty we have discussed so far is uncertainty in what some value is or was. This type of uncertainty is sometimes called reducible: in principle, we could go out and survey every person in the United States to determine their unemployment status, reducing this probability distribution to a single point. However, some uncertainty is irreducible: even if we surveyed every single person in the United States to see what their unemployment status was in April, we can’t know exactly what their unemployment status will be in May until May happens. While these two kinds of uncertainty³ are different, both can be characterized with probability distributions⁴. Here is a predictive distribution for what unemployment will be in May:
Notice how we are much less certain about what unemployment will be than we were about what it was. This prediction combines many uncertainties from a statistical model — including our uncertainty in what the unemployment rate was in April and our uncertainty in how much the unemployment rate tends to change each month — to make a prediction.
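A rough Monte Carlo sketch shows why combining uncertainties widens the prediction. The numbers are assumptions for illustration only: we treat our uncertainty about April as roughly normal around 3.65% with a standard deviation of 0.077 points, and we invent a month-to-month change with a standard deviation of 0.15 points. Our actual model is more involved than this.

```python
import random
import statistics

rng = random.Random(42)
n_draws = 20_000

# Assumed (hypothetical) uncertainty about April, in percentage points.
APRIL_MEAN, APRIL_SD = 3.65, 0.077
# Assumed (hypothetical) uncertainty about the change from April to May.
CHANGE_MEAN, CHANGE_SD = 0.0, 0.15

# Draw a plausible April value, then add a plausible monthly change to it.
april_draws = [rng.gauss(APRIL_MEAN, APRIL_SD) for _ in range(n_draws)]
may_draws = [april + rng.gauss(CHANGE_MEAN, CHANGE_SD) for april in april_draws]

april_sd = statistics.stdev(april_draws)
may_sd = statistics.stdev(may_draws)

print(f"spread of April estimate: {april_sd:.3f} points")
print(f"spread of May prediction: {may_sd:.3f} points")
```

Because each simulated May value inherits the uncertainty in April and then adds the uncertainty in the monthly change, the predictive distribution is necessarily wider than the distribution for April alone.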
Even more generally, we can talk about uncertainty in how well our mathematical description of reality matches the real world⁵. Statistical models use imperfect assumptions and approximations about how the world works, and we have uncertainty about how good those approximations are for a problem at hand. Such uncertainty may be qualitative, not quantitative. Communicating that kind of uncertainty is the subject of a future post; for now, we concentrate on communicating probabilistic uncertainty from a model, assuming we have done our due diligence to ensure the model is good enough.
Should We Care About Uncertainty?
Is uncertainty a big deal when we’re dealing with estimates like those we might see in the media? Often, yes. In fact, keeping with our economic theme, the government frequently revises estimates like the jobs numbers. For the unemployment rate, revisions of 0.2 or 0.3 percentage points are not unheard of, corresponding to estimating the wrong status for roughly 1 million people. Whether revisions are made or not, there is a margin of error contributed by the sampling process. The Bureau of Labor Statistics reports that there is a 90 percent chance that the unemployment estimate for a given month in 2018 is within about 300,000 of the true unemployment figure we would obtain if we did a census of the entire population. This means there is a 10% chance that sampling error leads the estimate to be off by more than 300,000.
If these differences still don’t sound that important, consider the way that unemployment figures are often used in the media, where a small increase or decrease in unemployment over a couple months can be framed as indicating a healthier economy or effective government. For example, one recent article claims “some 419,000 people entered the labor force in December in search of work, lured by easy-to-find jobs and rising wage.” Add a “±300,000” after that figure and it’s not so convincing. One might conclude that uncertainty should always be visualized in charts of unemployment over time, so that viewers don’t have to guess how much of a margin of error surrounds each estimate. While we wait for the Bureau of Labor Statistics to reach this conclusion, let’s talk about how we might do that.
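Here is the arithmetic behind that “±300,000”, as a short sketch. It takes the two figures quoted above at face value and additionally assumes that sampling error is roughly normal, which lets us back out an implied standard error from the 90% margin. That normality assumption is ours, not something the BLS states here.

```python
from statistics import NormalDist

estimate = 419_000   # figure quoted in the article above
margin_90 = 300_000  # 90% margin of error discussed above

low, high = estimate - margin_90, estimate + margin_90

# Under a normal approximation, a 90% margin spans about 1.645 standard errors.
z_90 = NormalDist().inv_cdf(0.95)
std_error = margin_90 / z_90

print(f"90% interval: {low:,} to {high:,}")
print(f"implied standard error: about {std_error:,.0f}")
```

Read this way, the headline figure is consistent with anything from a little over a hundred thousand to more than seven hundred thousand people, which is a much less tidy story than a single number.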
Subtle (Invisible?) Uncertainty Communication
For starters, let’s assume that we have a distribution representing possible values that our data could take. Some communication approaches are subtle. In fact, one could say that the visualization below already communicates uncertainty. Can you tell how?
By rounding the unemployment rate estimate (rather than reporting, say, 3.62471%), the visualization implies that the estimate is imprecise. People frequently round numbers in responding to survey questions to signal their uncertainty (Manski and Molinari 2010) and are quite good at catching on that a number is rounded and assuming it to mean that the reported value is subject to some margin of error (Campbell 2005). The problem is, it’s often ambiguous exactly how much rounding has been done. For example, a reported probability of 0.5 (or a 50% chance) could mean that the communicator is signaling that they have no confidence in their ability to guess the number at all: that it could be anywhere between 0 and 1 (Fischhoff and Bruine de Bruin 1999). On the other hand, it could also mean that the value is somewhere between 0.45 and 0.55. We could call this strategy for communicating uncertainty imprecise.
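One way to make the narrower reading concrete is to compute the range of values that would round to a reported number. The helper below is a hypothetical sketch that assumes the last digit shown is the rounding unit. That is exactly the assumption a reader can’t verify, which is the source of the ambiguity.

```python
from decimal import Decimal

def rounding_interval(reported: str):
    """Range of values that would round to `reported` at its displayed precision."""
    value = Decimal(reported)
    exponent = value.as_tuple().exponent       # e.g. -1 for "3.6"
    half_step = Decimal(1).scaleb(exponent) / 2  # half of the last displayed unit
    return value - half_step, value + half_step

for reported in ["0.5", "3.6", "3.62471"]:
    low, high = rounding_interval(reported)
    print(f"{reported} could stand for anything from about {low} to {high}")
```

For a reported 0.5 this gives 0.45 to 0.55. Nothing in the number itself rules out the other reading, where 0.5 simply means “no idea.”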
Other examples of imprecise uncertainty communication include when a designer chooses to use visualization techniques that are harder for people to read. For example, the average person will be further off from judging the true data value when it is mapped to circular area (on the left) compared to when it is mapped to position along the y-axis as in a bar chart (on the right). Choosing a visual encoding that makes the user’s job of inferring the data values noisier — more error prone — can be seen as a way of suggesting to the viewer, “don’t take this data too seriously.” In visualization research, we call visualization techniques that make it harder to read numeric data values, like area, less effective (Mackinlay 1986), and we’ll return to this effectiveness criterion in future posts on uncertainty visualization techniques.
There are problems with choosing to use a less effective visualization technique as a way of conveying uncertainty, not unlike the problems with rounding. First, there’s no guarantee that any individual viewer’s error will be proportional to the amount of uncertainty that is intended. Even if a designer chooses an encoding based on estimates of average error from graphical perception experiments, individual differences in accuracy can mean that one viewer’s estimate nearly perfectly matches the true value, while another viewer’s is even further off than the intended amount of error. Since we usually evaluate visualization designs based on how well they perform for individual users, it’s hard to rationalize a design choice that depends heavily on results averaged across viewers. Second, there’s no guarantee that a viewer will recognize that the visual encoding is less effective, and therefore lower their confidence in their estimate of the value: they might feel very confident about a bad estimate.
A more precise, yet still subtle, approach is to present the point estimates, but with a margin of error. Here’s a visual example of how the New York Times mentions a margin of error with their political polling estimates:
It just kind of jumps out at you, right? RIGHT?
If you’re a skeptic you may be wondering: are people really going to take the margin of error into account? Walking away from reading this article, is the average user going to think about support for Slotkin as plausibly as low as 44% or as high as 53% (and even possibly outside of that range), or describe the results this way to a friend?
If we trust what decades of research in decision-making under uncertainty have to say, your skepticism is warranted. Many studies have shown that people rely on various strategies to avoid dealing with uncertainty information head-on, such as heuristics: mental shortcuts that allow a person to make a judgment under uncertainty more easily (see, e.g., the work of Kahneman and Tversky or Gigerenzer and colleagues). This is not entirely surprising: uncertainty makes our judgments feel harder. When uncertainty is reported separately from a point estimate, as in the margin of error example above, it’s easy to ignore. We could call this strategy for communicating uncertainty dismissible. A user’s ability to dismiss uncertainty in making a decision will come up again in a later post when we talk about graphical annotations, like error bars.
This post laid the groundwork for understanding what uncertainty typically refers to in a visualization context. We also saw some techniques that may reduce your effort in communicating uncertainty as a designer, but fail to guarantee that viewers will adjust their confidence proportionally to the uncertainty in the data. Next up we’ll take a tour of techniques that map probability density to visual variables like height, width, or opacity.
- ^ In reality, we don’t actually iterate over every possible value. We use computational methods that can give us the probability distributions we want or approximations to them. For the examples in this article, we used a relatively simple Bayesian structural time series model. Note that we are not experts in unemployment data: this model is intended for illustrative purposes only and should not be used for serious applications. You can see our methodology here.
- ^ For the purposes of this article, we are considering uncertainty (including epistemic uncertainty) to be represented by probability distributions. Different statistical schools might quantify the plausibility of particular parameter values differently (perhaps using a sampling distribution, a likelihood, or a Bayesian posterior) or reject the idea that such plausibilities can be assigned in the first place. We fall into the former camp, but have tried to remain agnostic to the particular way that uncertainties are derived (in a frequentist, likelihoodist, or Bayesian way) for the purposes of this article.
- ^ Sometimes called epistemic and aleatory uncertainty, respectively.
- ^ At least, in the Bayesian statistical framework we adopt here.
- ^ Sometimes called ontological uncertainty, in contrast to epistemic and aleatory uncertainty already described.