Color is one of the most common and effective ways to convey information in data visualization. However, careless use of color can just as easily confuse the reader. Here, I’m going to outline a few common ways color gets misused in data visualization.
Color Coding Too Many Items
One common mistake when using color to communicate data is encoding too many data points in different colors. Take the figure below, from Datawrapper’s blog, “Chartable.” The first example in this image shows a graph comparing two different datasets across 13 different letters. Although each letter has its own color to indicate its value, there are so many colors and values that the reader has to keep checking back and forth to confirm that the color in the key is actually the color they are looking at in the data.
In his book, “Fundamentals of Data Visualization,” Wilke has this to say about the topic:
“As a rule of thumb, qualitative color scales work best when there are three to five different categories that need to be colored. Once we reach eight to ten different categories or more, the task of matching colors to categories becomes too burdensome to be useful, even if the colors remain sufficiently different to be distinguishable in principle.” (Wilke, 2018)
With this in mind, it would be better not to use a qualitative color scale (such as the one below) and instead to use a different method to compare the values. If you still wanted to use color, as in the second example below, you could darken the colors of the items with greater values. This helps communicate the purpose of the graph without confusing the reader with unnecessary colors.
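To make the darkening idea concrete, here is a minimal Python sketch that maps each value to a shade of a single hue, from light to dark. The function name `shade_by_value` and the two endpoint colors are my own assumptions for illustration; they are not taken from the Datawrapper figure.

```python
def shade_by_value(values, light=(222, 235, 247), dark=(8, 48, 107)):
    """Return one hex color per value, darker for greater values.

    `light` and `dark` are assumed RGB endpoints of a single-hue ramp;
    swap in whatever hue suits the chart.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    colors = []
    for v in values:
        # Position of v between the smallest (0.0) and largest (1.0) value.
        t = 0.0 if span == 0 else (v - lo) / span
        # Blend each RGB channel linearly from the light to the dark endpoint.
        rgb = tuple(round(l + t * (d - l)) for l, d in zip(light, dark))
        colors.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return colors


# Hypothetical data: five lettered categories and their values.
data = {"A": 12, "B": 30, "C": 7, "D": 25, "E": 18}
for label, color in zip(data, shade_by_value(list(data.values()))):
    print(label, color)
```

Because every item shares one hue, the reader only has to judge "lighter or darker" instead of matching many distinct colors against a key.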
Using Arbitrary Color Values
Another common pitfall of color in data visualization comes from using arbitrary color values when assigning colors to data points. In the example below, you can find a chart that Wilke uses in his book to exemplify arbitrary color values. This graph uses a rainbow scale to show values from 0% to 100%.
The immediate issue with choosing a rainbow scale for this graph is that the colors carry no intuitive meaning. Without looking at the key, it would be impossible to understand the progression of the values that these colors represent.
This is because the scale chosen does not indicate which values are distant from each other, and which are close. Again in his book, Wilke discusses two important conditions that all color scales must meet when representing sequential values.
- Colors should indicate which values are greater or lesser than the other values.
- The difference between colors should represent the difference between values.
The easiest way to see the value (lightness) of colors is to look at the greyscale version of the scale. In the figure below, you can see the original rainbow scale, its greyscale version, and a standard greyscale scale. Take note of how clearly the standard greyscale presents data in a way that meets both conditions, and how the rainbow scale does not.
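The greyscale test can also be approximated in code. The sketch below converts each color to a rough lightness value and checks whether the scale gets steadily lighter or steadily darker. It assumes the common Rec. 601 luma weights as a stand-in for perceived lightness, and the two example scales are simplified lists I made up, not the exact colors from Wilke's figure.

```python
def luminance(hex_color):
    """Approximate lightness (0 to 255) of a hex color using Rec. 601 luma weights."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.299 * r + 0.587 * g + 0.114 * b


def is_monotonic(scale):
    """True if lightness strictly increases or strictly decreases along the scale."""
    lums = [luminance(c) for c in scale]
    increasing = all(a < b for a, b in zip(lums, lums[1:]))
    decreasing = all(a > b for a, b in zip(lums, lums[1:]))
    return increasing or decreasing


# Simplified example scales (assumptions for illustration only).
rainbow = ["#ff0000", "#ffff00", "#00ff00", "#00ffff", "#0000ff"]
greys = ["#000000", "#404040", "#808080", "#c0c0c0", "#ffffff"]

print(is_monotonic(rainbow))  # False: lightness jumps up and down
print(is_monotonic(greys))    # True: lightness rises steadily
```

This only checks the first condition (ordering). Checking the second would mean comparing the size of each lightness step against the size of the corresponding step in the data.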
Not Designing for Color-Impaired Users
The third common pitfall when designing with color is not taking color-impaired users into consideration. Some scales, such as the sequential scale, will not cause problems for color-impaired users. This is because a sequential scale that is properly designed will have a clear dark-to-light gradient that should be easy to read, regardless of the colors used. However, a scale such as a diverging scale can cause very serious issues for a colorblind user. In order to understand how we can approach designing for colorblind users, it’s important that we first understand what types of impairments we might be designing for.
The three traditional forms of colorblindness are protanopia, deuteranopia, and tritanopia. The figure below shows how a rainbow scale appears to people with each of the three types (Okabe, 2002).
At a glance, it seems impossible to create palettes that will work for all colorblind people. However, tools such as ColorBrewer2 can be used to create safe palettes for all users. Below, you can find a palette created with ColorBrewer2 that is safe for all color-impaired persons. Many other color schemes can be used in place of this one.
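As one illustration of using a fixed safe palette in code, the sketch below hard-codes the eight-color palette commonly attributed to Okabe and Ito's Color Universal Design work and assigns its colors to categories in order. This is a different palette from the ColorBrewer2 one shown in the figure, the hex values are the commonly reproduced ones and worth checking against the source, and the helper `assign_colors` is a hypothetical name of my own.

```python
# Eight colors commonly attributed to Okabe and Ito's Color Universal Design palette.
OKABE_ITO = {
    "orange": "#E69F00",
    "sky blue": "#56B4E9",
    "bluish green": "#009E73",
    "yellow": "#F0E442",
    "blue": "#0072B2",
    "vermilion": "#D55E00",
    "reddish purple": "#CC79A7",
    "black": "#000000",
}


def assign_colors(categories):
    """Map each category to one palette color, in order.

    Raises ValueError rather than reusing or inventing colors when there
    are more categories than the palette can distinguish.
    """
    palette = list(OKABE_ITO.values())
    if len(categories) > len(palette):
        raise ValueError(
            f"{len(categories)} categories exceed the {len(palette)}-color palette; "
            "consider grouping categories or using a different encoding."
        )
    return dict(zip(categories, palette))


# Hypothetical categories for illustration.
print(assign_colors(["Region A", "Region B", "Region C"]))
```

Refusing to go past eight categories also lines up with the earlier point about qualitative scales becoming hard to read once there are too many colors.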
Overall, color can be one of the clearest and most guiding resources when designing for data visualization. However, as designers, it is our responsibility to ensure that our choices follow function and are made to ensure that our communication is effective.
Chartable. (2018). What to consider when choosing colors for data visualization. [online] Available at: https://blog.datawrapper.de/colors/ [Accessed 11 Dec. 2018].
Okabe, M. (2002). Color Universal Design (CUD) / Colorblind Barrier Free. [online] Jfly.iam.u-tokyo.ac.jp. Available at: http://jfly.iam.u-tokyo.ac.jp/color/#see [Accessed 11 Dec. 2018].
Wilke, C. (2018). Fundamentals of Data Visualization. [S.l.]: O’Reilly Media.