Simulating Visualisations in R for Colour Blindness

Indeep Singh
Published in Version 1
Feb 3, 2021
Photo by Christina Rumpf on Unsplash

Data is easier to analyse and understand when it is presented through charts and figures. The right visualisation can turn raw data points into insightful information. One of the most important aesthetics of any visualisation is colour: used well, it brings insights out more clearly. Depending on the type of information we want to present, we can choose between qualitative colours (for categories) and quantitative colours (for numeric values) for our plots.

Most of us, as analysts and BI developers, overlook some constraints when choosing colours for our visualisations. When we generate plots, we pick colours that look attractive to us or seem more appealing than others: more contrast, light-and-dark shades, and so on. We presume that we are covering most of our audience's requirements, but are we considering the hidden issues that someone in the audience might face in perceiving our plots because of the colours we chose? These hidden issues can be experienced by a person with colour blindness.

A colour-blind person will have difficulty distinguishing some of the colours that we apply in our visualisations and, because of this, may not be able to grasp the insights we want to deliver.

This blog is aimed at developers who produce figures and plots for their end customers or project clients. After going through it, we should have a different perspective when choosing colours for our charts.

Objective

In this article, we illustrate how to simulate visualisations in R as they appear to people with colour blindness. We will see how colour blindness affects people's ability to infer information from visualisations and which types of colour blindness exist. We will also cover creating plots in R using colour-blind-friendly libraries.

What is Colour Blindness?

Colour Vision Deficiency, also known as colour blindness, is a diminished ability to distinguish between specific colours. The retina has 2 types of cells that detect light: rods and cones. Cone cells are the ones responsible for detecting colour, and they come in three types, commonly called red, green and blue.

Figure 1. Cone cells in the eye. Source

Colour blindness occurs when one or more types of cone cells are absent or not working properly. It ranges from mild cases, where one type of cone cell does not function properly, to severe cases, where none of the three types of cone cells work.

It is estimated that 1 in 12 men (8%) and 1 in 200 women (0.5%) have some form of colour vision deficiency.

Let’s discuss some common types of colour blindness.

Figure 2. Protanomaly. Source

Protanomaly, or Protan Colour Blindness, causes a person to have trouble differentiating blue from green and red from green.

Figure 3. Deutanomaly. Source

Deutanomaly, or Deutan Colour Blindness, causes a person to confuse colours such as green and yellow, or purple and blue.

Figure 4. Tritanomaly. Source

Tritanomaly, or Tritan Colour Blindness, causes a person to confuse blue with green, or yellow with violet.

There is one more, rarer form of colour blindness: Monochromacy, also called Achromatopsia, in which a person sees no colour at all.

How do colour blind people see colours?

The figure below shows how people with normal colour vision and people with colour vision deficiency see a colourful image.

Figure 5. Colour-blind variants of a normal colourful image. Source

So far, we have gone through some concepts related to Colour Vision Deficiency. Let us now dive into the main agenda of this blog post: how can we create plots in R that read well for colour-blind people as well as for people with normal vision?

In the following sections, we simulate how plots appear to colour-blind viewers, and show how specific colour sets/palettes can be used in R to produce plots that work for them as well as for people with normal vision.

We will simulate 3 figures that use different colour palettes: 1. default ggplot colours, 2. the Okabe and Ito colour palette, and 3. a colour palette generated using the viridis library.

A quick introduction to the Okabe and Ito colour palette: Okabe and Ito have suggested a set of colours that works for both colour-blind people and people with normal vision. Their suggested palette is shown below.

Figure 6. Colour palette suggested for colour-blind and non-colour-blind people. Source
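For readers who want the palette in code, the sketch below lists the eight Okabe and Ito colours as hex codes and draws a quick swatch with base R graphics. The names attached to each colour are descriptive labels chosen for this example, not part of any library.

```r
# Okabe and Ito palette as hex codes.
# The element names are illustrative labels only.
okabe_ito <- c(
  black          = "#000000",
  orange         = "#E69F00",
  sky_blue       = "#56B4E9",
  bluish_green   = "#009E73",
  yellow         = "#F0E442",
  blue           = "#0072B2",
  vermillion     = "#D55E00",
  reddish_purple = "#CC79A7"
)

print(okabe_ito)

# Draw one bar per colour as a simple swatch
barplot(
  rep(1, length(okabe_ito)),
  col = okabe_ito,
  names.arg = names(okabe_ito),
  las = 2,        # rotate labels so they fit
  axes = FALSE,   # the bar heights carry no meaning
  border = NA
)
```

In R 4.0.0 or later, base R should also offer this palette through palette.colors(palette = "Okabe-Ito"), which avoids typing the hex codes by hand.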

We will be using the mtcars dataset, which is built into R. It describes 32 automobiles in terms of their performance and design. Below is a snapshot of how this dataset can be loaded into the R environment, along with a peek at its first 6 rows.

Loading mtcars and Displaying top 6 rows from it
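The code for this step appears only as a screenshot in the original post. A minimal sketch of what it likely looks like:

```r
# mtcars ships with base R (package 'datasets'), so nothing needs downloading
data(mtcars)

# Peek at the first 6 rows
head(mtcars)

# Number of rows (automobiles) and columns (variables)
dim(mtcars)
```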

Next, we need to install and load some R libraries to achieve our final objective. Below is the list of libraries we will need:

1. ggplot2 is a popular R library for generating plots.

2. dplyr is an R package for working with data frames.

3. colorblindr is a library that helps to imitate the effect of colour blindness in figures. We can use the functions it provides to simulate various types of colour vision deficiency.

4. viridis is a library which provides colour maps/palettes that can be used in R plots. The colour maps provided via viridis are designed to be perceived well by readers with normal vision and by readers with all common forms of colour vision deficiency.

Installation steps for the above-mentioned libraries are given on their official documentation pages, which are linked from each library name. After installation, the next step is to load these libraries into the R environment.

Loading libraries
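A sketch of the loading step is shown here, since the original screenshot is not reproduced. The commented-out installation lines are an assumption on my part: ggplot2, dplyr and viridis are on CRAN, while colorblindr is, to my knowledge, installed from its GitHub repository. Please check each package's documentation for the current instructions.

```r
# One-time installation (uncomment to run). These lines are an assumption;
# follow the official documentation if they differ.
# install.packages(c("ggplot2", "dplyr", "viridis", "remotes"))
# remotes::install_github("clauswilke/colorblindr")

library(ggplot2)      # plotting
library(dplyr)        # data frame manipulation
library(colorblindr)  # colour vision deficiency simulation
library(viridis)      # colour-blind-friendly colour maps
```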

We now have the dataset and the libraries that we need for simulating R plots. We just need to make a slight change to the dataset: convert one of the columns of mtcars to a factor using the code in the screenshot below.

Modifying column type in mtcars dataset
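A minimal sketch of this conversion, assuming the original used a plain as.factor call (the screenshot is not available to confirm):

```r
data(mtcars)

# cyl is stored as a number by default
class(mtcars$cyl)

# Convert it to a factor so ggplot2 treats it as a discrete variable
mtcars$cyl <- as.factor(mtcars$cyl)

# The three cylinder counts become the factor levels
levels(mtcars$cyl)
```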

We need to perform this step because the column ‘cyl’ is stored as a continuous variable even though it has only 3 values: 4, 6 and 8. Since we will use this column to colour and shape the data points, we need it as a factor variable.

Now, we will generate our first plot with the default colours using ggplot2.

Plotting with default colour palette in R ggplot
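The original code is only available as a screenshot, so the following is a sketch reconstructed from the description in this post. The title text, point size and theme colours are my assumptions and may differ from the author's values.

```r
library(ggplot2)

# cyl must be a factor so it maps to discrete colours and shapes
mtcars$cyl <- as.factor(mtcars$cyl)

plot1 <- ggplot(mtcars, aes(x = mpg, y = wt, colour = cyl, shape = cyl)) +
  geom_point(size = 3) +                       # point size is an assumption
  ggtitle("Plot1: default ggplot colours") +   # title text is an assumption
  theme(
    panel.background = element_rect(fill = "white"),      # background colour
    panel.grid.major = element_line(colour = "grey85")    # major grid lines
  )

plot1
```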

Let us go through the code once. We define the ggplot aesthetics with the ‘mpg’ column on the x-axis, the ‘wt’ column on the y-axis, and the ‘cyl’ column for the shape and colour of the data points. Next, we give the plot a title using ‘ggtitle’. We also set the size of the data points using ‘geom_point’. Finally, we define some theme attributes to change the background colour and draw major grid lines. After running the above code snippet, we get the following plot.

Plot1 with default ggplot colours

In Plot1, a person with normal vision can see distinct colours for automobiles with 4, 6 and 8 cylinders.

But have you ever wondered how a person with colour blindness will see this plot? Let us simulate that now. We will use the function ‘cvd_grid’ from the library ‘colorblindr’ to reproduce Plot1 as colour-blind people see it. Below is the code snippet for this step. Just pass the ggplot object to the function and it is done.

Code snippet to reproduce colour blind grid
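A self-contained sketch of this step follows. The plot is rebuilt in a reduced form so the example runs on its own, and the variable names plot1 and grid1 are illustrative.

```r
library(ggplot2)
library(colorblindr)

mtcars$cyl <- as.factor(mtcars$cyl)

# A reduced version of Plot1 (default ggplot colours)
plot1 <- ggplot(mtcars, aes(x = mpg, y = wt, colour = cyl, shape = cyl)) +
  geom_point(size = 3)

# Simulate the plot under several colour vision deficiencies
# and arrange the results in a grid
grid1 <- cvd_grid(plot1)
grid1
```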

The above command generates a grid of plots, as shown below, for various types of colour blindness.

Plot 1 colour blind variants generated using colorblindr library

The above figure shows the obstacles faced by a person with colour blindness. In the Deutanomaly and Protanomaly panels, the colours look almost the same except for some difference in contrast. With either of these deficiencies, a person will have trouble telling apart the values for 4- and 6-cylinder automobiles. In the case of Tritanomaly, the person will face challenges distinguishing data points for 6- and 8-cylinder automobiles. Finally, in the Desaturated panel, which corresponds to Achromatopsia, a person sees no colours at all; here all the data points appear as shades of grey.
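The same effect can be inspected on the colours themselves rather than on the whole plot. The sketch below is an addition, not part of the original post: it takes the three default ggplot2 colours for a three-level factor (hard-coded as hex codes) and passes them through the simulation functions of the colorspace package, which I assume is installed since colorblindr depends on it.

```r
library(colorspace)

# Default ggplot2 colours for three discrete groups, keyed by cylinder count
default_cols <- c("4" = "#F8766D", "6" = "#00BA38", "8" = "#619CFF")
cols <- unname(default_cols)

# Simulate each colour under full-severity deficiencies
simulated <- data.frame(
  cyl         = names(default_cols),
  normal      = cols,
  deutan      = deutan(cols),
  protan      = protan(cols),
  tritan      = tritan(cols),
  desaturated = desaturate(cols)
)

print(simulated)
```

Comparing the hex codes within a column gives a rough sense of how close two groups become under that deficiency.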

To conclude, comparing Plot1 in normal vision and in colour-blind vision makes the problems faced by people with colour vision deficiency easy to spot. With normal vision, the colours in Plot1 are appealing and contrast well across the cylinder types; for colour-blind viewers they do not. If we had not used a different shape for each cylinder type, it would be very difficult for a person with colour blindness to differentiate between the data points.

Now that we are aware of the problem with choosing the common/default colours for plots generated in R, let us explore 2 solutions that use different colour palettes, along with their colour-blind variants.

Plot 2 will use the colour-blind-friendly palette proposed by Okabe and Ito. In this plot we will choose any 3 colours from the suggested palette and then produce their colour-blind variants.

Here is the code snapshot for generating Plot2 with 3 colours from the Okabe and Ito colour palette.

Plotting using colours from Okabe and Ito colour palette
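A sketch of how Plot2 could be built with scale_colour_manual is given here. The post does not say which 3 colours were picked, so the three used below (orange, sky blue and bluish green) are my own choice and the resulting simulation may differ from the figures shown in this post.

```r
library(ggplot2)

mtcars$cyl <- as.factor(mtcars$cyl)

# Three colours from the Okabe and Ito palette.
# Which three the author used is unknown; these are an assumption.
chosen_cols <- c("#E69F00", "#56B4E9", "#009E73")

plot2 <- ggplot(mtcars, aes(x = mpg, y = wt, colour = cyl, shape = cyl)) +
  geom_point(size = 3) +
  scale_colour_manual(values = chosen_cols) +   # override the default colours
  ggtitle("Plot2: Okabe and Ito colours")

plot2

# Colour-blind variants, if colorblindr is installed:
# colorblindr::cvd_grid(plot2)
```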

The above code snippet generates an R plot as shown below.

Plot 2 using Okabe and Ito suggested colour palette

Here, Plot 2 shows very distinct colours for a person with normal vision. Let us generate the colour-blind variants of this plot and then decide whether the palette suggested by Okabe and Ito is useful or not. The step is the same as for Plot1: pass the Plot2 variable to the cvd_grid function. This produces a plot grid as shown below.

Plot 2 colour blind variants

In normal vision, the colours of Plot 2 were good enough to differentiate between the cylinder classes. The colour-blind perspective is more mixed. The Deutanomaly and Protanomaly renditions still provide a clear visual, but under Tritanomaly the 4- and 6-cylinder data points may be difficult to tell apart. In the Desaturated version, the data points can be distinguished, but not easily.

In Plot3, we will use the viridis colour scales. There are 2 different functions for choosing colours for qualitative (discrete) and quantitative (continuous) values: scale_colour_viridis_d and scale_colour_viridis_c respectively (these two functions are exported by ggplot2, which builds them on the viridis colour maps). In our scenario, since the ‘cyl’ column is qualitative, we will use the scale_colour_viridis_d function for our plot. Below is the code snapshot used for producing Plot3 with the viridis qualitative colour scale.

Plotting using colours from Viridis qualitative colour scale
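A sketch of Plot3 under the same assumptions as the earlier plots (the title and point size are illustrative, since the original screenshot is not available):

```r
library(ggplot2)

mtcars$cyl <- as.factor(mtcars$cyl)

plot3 <- ggplot(mtcars, aes(x = mpg, y = wt, colour = cyl, shape = cyl)) +
  geom_point(size = 3) +
  scale_colour_viridis_d() +   # discrete viridis scale for the factor cyl
  ggtitle("Plot3: viridis colours")

plot3

# Colour-blind variants, if colorblindr is installed:
# colorblindr::cvd_grid(plot3)
```

For a continuous variable mapped to colour, scale_colour_viridis_c would take the place of scale_colour_viridis_d.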

The plot produced by the above code snippet is shown in the figure below. Here, a person with normal colour vision can clearly differentiate the colours of the data points.

Plot 3 using Viridis colour scale

Let’s have a look at its equivalent colour-blind variants.

Plot 3 colour blind variant

The colour-blind variants of Plot 3 are the most clearly distinguishable of the three. All 3 colours assigned by the viridis scale remain distinct under all the simulated types of colour blindness.

What have we seen so far?

Colours play a very important role in data visualisations, and we must be careful when choosing them. What looks attractive to us with normal vision might not work for someone with colour blindness. We have seen how colours in a plot are perceived by people with different types of colour vision deficiency. Fortunately, there are colour-blind-friendly palettes and R packages that can colour your R plots so that they are well differentiated and understood by colour-blind viewers as well.

With this, I bring this post to an end. I hope I have covered enough to convey the message I set out to deliver.

Please let me know your valuable thoughts and feedback.

Cheers!

About the Author

Indeep Singh is a Data Analytics Consultant, currently working in Version 1’s Data Analytics practice. Be sure to follow Indeep for more blogs on data analytics and data visualisation.
