The real world is full of uncertainty, but it can be tough to communicate that uncertainty. This is especially true for data visualization, where the usual practice is to quantify uncertainty (turn it into a number somehow) and then encode uncertainty by visualizing it. This has to happen at the same time as figuring out how to represent the rest of the data. Uncertainty information inherently makes visualizations more complex: it’s more data to show, and uncertainty quantification can be a complex process that results in numbers that are difficult to interpret.
To make matters worse, one goal of showing uncertainty is to integrate it into the decision-making process. That is, you may want people to be less confident in decisions based on highly uncertain data. If you don’t properly integrate uncertainty information, then you risk making the uncertainty ignorable, such that people will ignore the risks and variation in the data, and treat things that should have a lot of uncertainty (the outcomes of elections, the effectiveness of medication or diets, or the expected arrival of transportation) as certainties. You also don’t want people to be too hasty: if there’s too much uncertainty, maybe the right decision is to wait until there’s more data, or to refrain from making too strong a prediction.
One strategy to make uncertainty unignorable is to use a “bivariate map.” Bivariate maps encode two types of data in the same visual channel. For example, Joshua Stevens’ Sasquatch map (Fig. 2) assigns a color to a U.S. county based on two variables: its population density, and its number of sasquatch sightings. These maps have been around for a long time (see Fig. 3), but they can be hard to interpret. The visual properties we use to represent sasquatch sightings can be difficult to disentangle from the colors we use to represent population density: we don’t perceive “40% green, 60% purple” very accurately when we look at colors! As such, bivariate maps usually limit themselves to a small set of outputs. There are only 9 possible colors in the sasquatch map, and only 16 possible texture comparisons in the map in Fig. 3.
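To make the “small set of outputs” idea concrete, here is a minimal sketch of how a 3×3 bivariate palette might be looked up. Everything in it is an assumption for illustration: the color names are placeholders, the break points are made up, and this is not how the Sasquatch map was actually built.

```python
from bisect import bisect_right

# Hypothetical 3x3 palette. Rows are classes of the second variable,
# columns are classes of the first. The color names are placeholders.
PALETTE = [
    ["light gray", "light green", "green"],
    ["light purple", "muted teal", "dark green"],
    ["purple", "dark purple", "near black"],
]


def classify(value, breaks):
    """Return the class index (0, 1, or 2) of value given two sorted breaks."""
    return bisect_right(breaks, value)


def bivariate_color(x, y, x_breaks, y_breaks):
    """Pick one of the 9 palette colors from two independently binned values."""
    return PALETTE[classify(y, y_breaks)][classify(x, x_breaks)]


# Made-up break points: two thresholds per variable give three classes each.
density_breaks = [10, 100]
sightings_breaks = [1, 5]
print(bivariate_color(50, 3, density_breaks, sightings_breaks))
```

No matter how finely the underlying data varies, every county ends up in one of only 9 cells, which is the limited budget of outputs a designer has to work with.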
When we make a Value-Suppressing Uncertainty Palette (VSUP), we decide to spend this limited budget of outputs in the service of integrating data and uncertainty. We give more distinct outputs to the bivariate map when the data is very certain, and fewer when the data is highly uncertain. VSUPs have an internal “tree quantization” scheme to determine which combination of data value and uncertainty value corresponds to which discrete color. When data is highly uncertain, there’s only one output color. As certainty increases, this color has two “child colors” that divide the data domain equally, allowing us to distinguish high and low values from each other. As certainty increases again, each of these two children has two children of its own, chopping up the data domain into smaller and smaller regions, and allowing fine-grained distinctions as the level of certainty goes up. To drive this metaphor home, rather than the traditional bivariate square legend shape, we prefer to present VSUP legends as a pyramid or wedge (Fig. 4).
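The tree quantization idea can be sketched in a few lines. This is a simplified illustration rather than the code in our module: the function names, the choice of four layers, and the assumption that value and uncertainty are both already scaled to the range 0 to 1 are all hypothetical.

```python
def vsup_bin(value, uncertainty, layers=4, branching=2):
    """Return (layer, bin) for a value and an uncertainty, both in [0, 1].

    Layer 0 is the most uncertain layer and has a single bin. Each more
    certain layer has `branching` times as many bins as the one before it,
    and the bins in a layer split the value domain into equal parts.
    """
    if not (0.0 <= value <= 1.0 and 0.0 <= uncertainty <= 1.0):
        raise ValueError("value and uncertainty must both be in [0, 1]")

    # Assumption: uncertainty levels are cut into equal-width bands.
    certainty = 1.0 - uncertainty
    layer = min(int(certainty * layers), layers - 1)

    # The number of bins grows with certainty: 1, 2, 4, 8, ...
    n_bins = branching ** layer
    bin_index = min(int(value * n_bins), n_bins - 1)
    return layer, bin_index


def palette_size(layers=4, branching=2):
    """Total number of distinct outputs (colors) the palette needs."""
    return sum(branching ** k for k in range(layers))


print(vsup_bin(0.3, 0.95))  # highly uncertain: value differences are suppressed
print(vsup_bin(0.9, 0.95))  # lands in the same single bin as the line above
print(vsup_bin(0.3, 0.0))   # highly certain: the value gets a fine-grained bin
```

With four layers and two children per node, this sketch needs 1 + 2 + 4 + 8 = 15 colors, compared to the 16 a square 4×4 bivariate palette would use, and most of them are spent on the most certain layer.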
Fig. 1 shows an example of a VSUP designed to show polling data prior to the 2016 U.S. Presidential Election. If the candidates are very far apart in the polls, and the margins of error very low, then it’s responsible to talk about even minute differences in polling: a candidate leading 51% to 49% with very narrow margins of error is probably going to win in that state. As such, most of the colors are devoted to these highly certain values. As the margins of error get bigger, speculating about these small differences becomes less responsible: we devote fewer and fewer colors to them, and states can only be said to lean in one direction or the other. If a candidate is polling within 2 margins of error from their opponent, everything gets mapped to the same “tossup” color. The VSUP suppresses uncertain values, discouraging viewers from making predictions about them.
VSUPs encourage people to be cautious about their judgments when uncertainty is high, but this is not always the behavior a designer might want to see: for instance, a highly uncertain but highly important “black swan” event might deserve high salience in the display, no matter the uncertainty information. Likewise, VSUPs rely on the designer to choose important levels of uncertainty, and what counts as “too uncertain to distinguish.” This definition might not be fixed, or could change over the course of the analysis session. In that case, the designer might want to allow some interactivity or filtering to reshape the VSUP and support new tasks.
There are more details about VSUPs, including an empirical evaluation of their effectiveness, in our paper repository. If you’d like to start making VSUPs for yourself, we’ve got a module that plays nice with D3.js!
This article was written by Michael Correll, Dominik Moritz, and Jeffrey Heer, describing a paper we presented at CHI 2018. For more, read the paper.