Modeling Color Difference

seansudol
CUNY GC Data Visualization
May 10, 2023

Have you ever had to adjust your screen brightness to better see the contrast between colors in a data visualization? Maybe you had to adjust the angle or change devices to get a better look? For a data visualization designer, this might be the better-case scenario: at least the viewer has the awareness to make adjustments and some tools to get a better view of the designer’s intent. A worse possibility is that the viewer remains blissfully unaware of any color difference at all.

Of course, designers spend a lot of resources planning color differences as part of their data visualizations, but how could they plan for every particularity, such as those above? One’s audience of potential viewers is certain to use different devices of varying screen quality. Even the same viewer might experience the same visualization differently if their device is in a less optimal environment at the second viewing. How does a data visualization designer account for all these contingencies?

In “Modeling Color Difference for Visualization Design”, Danielle Albers Szafir presents models that can be used to encode colors with probabilistic bounds that allow the designer to quantitatively evaluate the effectiveness of color differences for user perceptions. In doing so, the author’s design methods address common assumptions of design, and the result is an effective contribution to the field that can easily be used by designers to make visualizations more salient to viewers no matter their device quality or environment.

To build these models, the author designed an experiment that presented subjects with a series of visualizations, created with D3, and measured each visualization’s discriminability rate, that is, how often its color difference was perceived. Participants were excluded if they were color blind or if they failed tests with exaggerated color differences. The study covered very common visualization types (scatterplots, bar plots, and line graphs) and avoided area plots such as maps because of their large degree of variability in shape. Independent variables included mark size, color difference, and tested color axis (L*, a*, or b*).
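The L*, a*, and b* axes named above belong to the CIELAB color space. As a rough illustration of what a color difference along each axis means, here is a minimal sketch that converts 8-bit sRGB values to CIELAB and reports the per-axis differences alongside the plain Euclidean distance (the CIE76 ΔE). The function names are made up for this example, and the conversion assumes a D65 white point; none of this is taken from the paper’s own code.

```python
import math

# D65 reference white (assumed here; standard 2-degree observer values).
WHITE = (0.95047, 1.0, 1.08883)


def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB channels to CIELAB (L*, a*, b*)."""

    def linearize(c):
        # Undo the sRGB transfer curve to get linear light.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        # Piecewise cube root used by the CIELAB definition.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / WHITE[0]), f(y / WHITE[1]), f(z / WHITE[2])
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def lab_difference(lab1, lab2):
    """Per-axis differences plus the Euclidean (CIE76) distance."""
    dL, da, db = (p - q for p, q in zip(lab1, lab2))
    return {"dL": dL, "da": da, "db": db,
            "dE": math.sqrt(dL * dL + da * da + db * db)}


if __name__ == "__main__":
    blue = srgb_to_lab(31, 119, 180)
    orange = srgb_to_lab(255, 127, 14)
    print(lab_difference(blue, orange))
```

Separating the difference by axis matters because, as the findings below suggest, a difference of the same size is not equally easy to see on every axis.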

Example test visualizations, with variations in size and test colors.

Each test was designed to determine whether a subject could perceive a difference in color between two test marks. The original test marks drew on 79 colors, and the comparison mark in each visualization used a color systematically modified from the original mark’s color. The distance between test marks was held constant at a standard distance chosen for comfortable viewing, and distractor marks that overlapped a test mark were removed to prevent occlusion. All distractor marks used the same gray so that the viewer would focus on the color difference between the test marks.
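To make the idea of systematically modifying a color more concrete, the sketch below shifts a base CIELAB color along a single axis and builds a small grid of base/comparison pairs. The step sizes and function names are hypothetical, chosen only for illustration; the article does not report the step sizes the study actually used.

```python
def comparison_color(lab, axis, delta):
    """Return a copy of `lab` shifted by `delta` along one axis ("L", "a" or "b")."""
    index = {"L": 0, "a": 1, "b": 2}[axis]
    shifted = list(lab)
    shifted[index] += delta
    return tuple(shifted)


def trial_pairs(base, axes=("L", "a", "b"), steps=(2.0, 4.0, 6.0)):
    """Build (axis, step, base, comparison) tuples for every axis/step combination.

    The default steps are illustrative placeholders, not values from the study.
    """
    return [(axis, step, base, comparison_color(base, axis, step))
            for axis in axes
            for step in steps]


if __name__ == "__main__":
    for axis, step, base, comp in trial_pairs((50.0, 10.0, -10.0)):
        print(axis, step, base, comp)
```

Changing one axis at a time is what lets an experiment like this attribute a missed difference to that axis rather than to a mix of lightness and hue changes.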

One major finding of this experiment is that people’s ability to perceive color differences decreases as mark size decreases. Additionally, participants identified color differences more effectively with variations on the L* axis than on the a* or b* axes. Both findings held for all mark types considered in the study. Colors on elongated marks, as in the bar and line graphs, were more discriminable than those on scatterplot points, even when the thickness of the marks was equal.

Designers can apply the models devised in this study to their color choices, building a palette that maximizes discriminability with minimal color variation and that accounts for differences in mark size. Even when these variations in size are determined largely by the scope of the data, as is the case for points and bars, the models can be used to evaluate and refine color palettes to better ensure that a viewer sees a color difference and understands its intent.
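One way a designer might put a size-aware model to work is to check every pair of palette colors against a threshold that grows as marks shrink. The sketch below assumes, purely for illustration, a per-axis threshold of the form c + k / size and combines the axes as a scaled Euclidean distance. The coefficients are made-up placeholders, not values from the paper, and the function names are hypothetical; anyone applying the real model should take its form and parameters from the paper itself.

```python
from itertools import combinations

# HYPOTHETICAL placeholder coefficients (c, k) per axis, NOT values from the paper.
# They only encode the qualitative findings: smaller marks need larger differences,
# and L* differences are easier to see than a* or b* differences.
PLACEHOLDER_COEFFS = {"L": (5.0, 0.5), "a": (5.0, 0.8), "b": (5.0, 1.0)}


def threshold(axis, size):
    """Assumed smallest noticeable difference on one axis for a given mark size."""
    c, k = PLACEHOLDER_COEFFS[axis]
    return c + k / size


def is_discriminable(lab1, lab2, size):
    """True if the threshold-scaled distance between two CIELAB colors is at least 1."""
    total = 0.0
    for i, axis in enumerate(("L", "a", "b")):
        total += ((lab1[i] - lab2[i]) / threshold(axis, size)) ** 2
    return total ** 0.5 >= 1.0


def weak_pairs(palette, size):
    """List the palette pairs that fall below the assumed threshold at this size."""
    return [(p, q) for p, q in combinations(palette, 2)
            if not is_discriminable(p, q, size)]


if __name__ == "__main__":
    palette = [(50.0, 0.0, 0.0), (58.0, 0.0, 0.0), (50.0, 30.0, 0.0)]
    for size in (1.0, 0.1):
        print(size, weak_pairs(palette, size))
```

With these placeholder numbers, the two grays that differ by 8 units of L* pass at the larger size but are flagged at the smaller one, which is the kind of size-dependent warning the models are meant to provide.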

Comparing color encoding, with marks using the author’s model on the right

The results of the study are limited by the experiment’s design. To keep the perception task simple, the author limited the variables by which the visualizations were altered. However, under real-world conditions, visualizations vary in many more ways than the experiment allows. Many will include more than one color hue, and some will include more than one mark type. One important finding of the study, that color perception varies by size, could be different under conditions where color and size both vary in the same visualization. Additionally, the author notes that, while the design is meant to address the myriad ways that a user might view a visualization by not selecting participants based on display environment, the study could be improved by testing visualizations on different displays under varying conditions.

Despite this need for future work, the author has developed a useful model that data visualization designers can easily deploy to systematically improve the discernibility of their color encodings.
