Uncertainty + Visualization, Explained (Part 2: Continuous Encodings)
Matthew Kay and Jessica Hullman (the MU Collective).
TLDR: The second in a series summarizing what we know about visualizing uncertainty in data (also check out the first post). This post covers continuous encodings for static charts and trade-offs to consider in using them, including the effectiveness of different visual encodings for supporting accurate perception and space trade-offs.
In the first post in this series, we considered what the term uncertainty usually means when we’re talking about visualizing uncertainty. We also reviewed a few subtle, yet ineffective strategies for communicating uncertainty — ineffective largely because they are easy to ignore.
One way to draw greater attention to uncertainty in data is to visualize it directly. What are some ways to visualize uncertainty? And how well do those approaches work for different tasks?
In this post we’ll talk about continuous encodings of probability distributions in static charts. In future posts, we’ll cover other techniques, like traditional interval-based approaches, and modern techniques like the use of discrete outcomes (also called frequency framing) and animation.
Using Visual Variables and Continuous Encodings
A direct way to convey a probability distribution is by mapping the probability density (the relative amount of probability) of different values to a visual property (or variable) like the height, width, color value, opacity, etc. of a mark. In visualization speak, we call these visual variables. A probability density function is a continuous function¹, so when we map probability density to visual variables we call this a continuous encoding. We’ve already seen examples of probability density plots in the first post in this series: these use the area under a curve to encode probability. Here’s the uncertainty in the U.S. unemployment numbers plotted using probability density plots at six month intervals:
If this looks a bit odd, that may be because we usually see density plots horizontally. Here, the vertical plots let us keep the same axes as the original plot. However, there are also many other ways to map probability onto visual variables:
Several of these other techniques are closely related to density plots. You may be familiar with histograms (not shown here), which use binning instead of a smooth curve. Violin plots mirror the density plot, using thickness to convey probability. Gradient plots are a more uncertain-looking twist on a violin or density plot, varying the transparency or lightness at any given value, such that a darker color corresponds to higher probability. The idea of making more certain values darker seems intuitive to many people, though there is not strong evidence that it improves understanding or decision-making.
Research in graphical perception has studied how well people can read data when it is encoded using different visual variables, termed effectiveness. The most common approach, used in canonical studies like that of Cleveland and McGill, has been to evaluate how well people can make proportion estimates, like What % of a larger value is a smaller value? To apply the results of such studies to uncertainty visualizations, we can consider what visual judgment each type of visualization requires of the viewer to infer the probability density. To read a density plot, a viewer needs only compare the position of the top of the density at two different points. A violin plot instead requires judging length, as the viewer must compare the horizontal width of the shape at two points. Reading a gradient plot requires inferring values from lightness.
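To make the three mappings concrete, here is a minimal sketch in Python of how the same probability density could be turned into a height (density plot), a width (violin plot), or an opacity (gradient plot). The forecast distribution, its numbers, and the `encode_density` helper are our own illustrative assumptions, not data or code from the charts in this post.

```python
from statistics import NormalDist

# Hypothetical forecast: unemployment rate ~ Normal(4.2, 0.2).
# These numbers are made up for illustration only.
forecast = NormalDist(mu=4.2, sigma=0.2)


def encode_density(dist, values, max_height=1.0, max_width=1.0):
    """Map the density at each value to three candidate visual variables."""
    densities = [dist.pdf(v) for v in values]
    peak = max(densities)
    marks = []
    for v, d in zip(values, densities):
        rel = d / peak  # relative density, scaled to [0, 1]
        marks.append({
            "value": v,
            "height": rel * max_height,  # density plot: position of the curve top
            "width": rel * max_width,    # violin plot: thickness of the shape
            "opacity": rel,              # gradient plot: darker means more probable
        })
    return marks


values = [3.6 + 0.1 * i for i in range(13)]  # 3.6% to 4.8% in 0.1 steps
marks = encode_density(forecast, values)
```

Each entry in `marks` carries the same information three ways. Which field you hand to your drawing code determines whether the viewer has to make a position, length, or lightness judgment.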
Plotting the density function can be effective when your audience is familiar with its meaning, and you have only a small number of distributions to display. Height and width are relatively effective encodings (Cleveland and McGill 1984), so they can provide a high-resolution picture of the shape of the distribution. Because people are good at perceiving position, it is easy for them to find the mode of a density plot (it is the point with the highest density), and they likely can judge the mean (by visually averaging) and median (it splits the area in half) reasonably well too. Estimating the probability of a given region is a bit harder, because it requires estimating irregularly shaped areas, and we judge area less accurately than position.
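The summaries a viewer reads off a density plot each correspond to a simple computation on the curve. The sketch below, in Python, assumes a hypothetical normal forecast (the numbers are invented for illustration) evaluated on a fine grid, and approximates areas with a basic Riemann sum.

```python
from statistics import NormalDist

# Hypothetical forecast, for illustration only.
forecast = NormalDist(mu=4.2, sigma=0.2)

step = 0.001
grid = [3.0 + step * i for i in range(2401)]  # 3.0% to 5.4%
density = [forecast.pdf(x) for x in grid]

# Mode: the value where the curve is highest (a position judgment).
mode = grid[max(range(len(grid)), key=density.__getitem__)]

# Mean: the density-weighted average of the values.
mean = sum(x * d for x, d in zip(grid, density)) * step

# Median: the value where the accumulated area first reaches one half.
area = 0.0
median = None
for x, d in zip(grid, density):
    area += d * step
    if area >= 0.5:
        median = x
        break


def region_probability(lo, hi):
    """Approximate the area under the density between lo and hi."""
    return sum(d for x, d in zip(grid, density) if lo <= x <= hi) * step


# Probability the rate lands within one standard deviation of the mean
# (about 0.68 for a normal distribution).
p_region = region_probability(4.0, 4.4)
```

The mode only needs a single lookup, while the region probability needs a sum over the whole interval, which mirrors why the former is an easy visual judgment and the latter is not.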
Compared to density plots, it can be easier to plot many gradient plots simultaneously because they don’t require as much width and height as a density plot in order to be readable. In the following chart, we plot gradients that show the uncertainty in the unemployment rate for every month (note how much smoother the gradients below look compared to our density plots above, which show the uncertainty at only six-month intervals due to space constraints).
While we can easily plot more distributions using gradient plots than density plots, people’s ability to read quantities from a lightness encoding is not as good as their position or area perception. When deciding whether to use gradient plots or density plots, we face a trade-off between how much detail we can show along the time axis and how much detail people can perceive about the uncertainty at any given time point. Which plot is more appropriate depends on what communication goals we have for the visualization: if the goal is simply to convey a coarse sense of uncertainty, a gradient plot might suffice. If the goal is for the user to be able to compare the relative amounts of probability density to make decisions, a density plot is more appropriate.
We also need to consider non-perceptual errors that people make when presented with probability density functions. For example, some people may not be familiar with standard statistical graphics, and therefore may not be sure how to interpret what they are seeing. The confusing nature of probability makes uncertainty visualization challenging, especially when intended for audiences who aren’t accustomed to working with data. We suspect that while many people may associate an encoding like a density plot with statistics, they may not be sure what exactly it means. Whenever presenting uncertainty, it’s safest to include specific instructions for how to read the visualization.
If we think that people may be primarily interested in one-sided probability intervals (questions like, what is the chance that unemployment will be less than 4%?), we could instead plot the cumulative distribution function (CDF). Though the CDF can be unfamiliar to people, when the question that someone wants to answer is related to a one-sided probability interval, then a CDF may be the best choice: it replaces an area judgment (how much of the area in the density curve is under 4%?) or a lightness judgment (how much of the total “redness” of the gradient plot is under 4%?) with a length or position judgment (how high is the CDF at 4%?), and people are better at estimating position than area or lightness. If people are instead interested in the probability unemployment is greater than a particular value, we could also plot the complementary CDF (the CCDF) instead. This encoding can also be used to produce barchart-like plots² with uncertainty:
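As a small sketch of why the CDF turns an area judgment into a position judgment, the Python below reads one-sided probabilities directly off the CDF and CCDF. The forecast distribution and its parameters are hypothetical, chosen only to make the example concrete.

```python
from statistics import NormalDist

# Hypothetical forecast of the unemployment rate (in percent); illustrative only.
forecast = NormalDist(mu=4.2, sigma=0.2)


def cdf(threshold):
    """P(rate <= threshold): the height of the CDF at the threshold."""
    return forecast.cdf(threshold)


def ccdf(threshold):
    """P(rate > threshold): the height of the complementary CDF."""
    return 1.0 - forecast.cdf(threshold)


# "What is the chance that unemployment will be less than 4%?"
p_below_4 = cdf(4.0)

# "What is the chance that unemployment will be greater than 4%?"
p_above_4 = ccdf(4.0)
```

With a density plot, the viewer would have to estimate the area to the left of 4% to answer the first question. On a CDF, the answer is the single value `cdf(4.0)`, which is the height of the curve at that point.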
Some recent research from our lab (Fernandes et al. 2018) examined how well a CCDF compared to more standard plots of the probability density function for helping people make decisions about when to leave for the bus, answering questions like, What’s the probability I’ll miss the bus if I arrive at the stop in 3 minutes? We found that untrained users made better (more rational) decisions using a CCDF compared to a density plot (amongst other things).
By contrast, untrained users may find it more difficult to estimate location (mean, median, or mode) from a CDF compared to a density plot. The median is easy to find if you are trained — it is the point at 50% on the CDF — but other summaries, like mean and mode, are more difficult to find than with densities or gradients, even with training (Ibrekk and Morgan 1987).
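A short Python sketch shows why the median is the easy summary to read from a CDF while the mode is not. It assumes a hypothetical normal forecast (invented numbers) sampled on a grid. The median is just the point where the curve crosses 50%, whereas the mode is where the curve is steepest, which requires comparing slopes rather than positions.

```python
from statistics import NormalDist

# Hypothetical forecast, for illustration only.
forecast = NormalDist(mu=4.2, sigma=0.2)

step = 0.01
grid = [3.0 + step * i for i in range(241)]  # 3.0% to 5.4%
cdf_values = [forecast.cdf(x) for x in grid]

# Median: the first grid value where the CDF reaches 50%.
median = next(x for x, p in zip(grid, cdf_values) if p >= 0.5)

# Mode: the CDF's slope is the density, so the mode sits where the
# curve rises fastest. We estimate slopes between neighboring grid points.
slopes = [
    (cdf_values[i + 1] - cdf_values[i]) / step
    for i in range(len(grid) - 1)
]
steepest = max(range(len(slopes)), key=slopes.__getitem__)
mode = grid[steepest] + step / 2  # midpoint of the steepest segment
```

Because the assumed distribution is symmetric, the median and mode land at roughly the same value here. For a skewed distribution they would differ, and the slope comparison needed to find the mode would be even harder to do by eye.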
We’ve just scratched the surface of possible uncertainty visualizations. In future posts, we’ll discuss frequency-framing uncertainty visualizations, which capitalize on people’s ability to understand uncertainty in terms of natural frequencies, and animated approaches like hypothetical outcome plots that help people understand uncertainty by experiencing it. We’ll touch on tasks not supported by the sorts of visualizations shown above, such as judgments involving multiple outcomes: e.g., What’s the chance the unemployment rate goes up in May and then back down in June? Finally, we’ll look beyond communicating probabilistic uncertainty entirely: How do we communicate uncertainty in how well our statistical model describes the world?