A Practical Introduction to Colors in Python

Vincent Lonij
6 min read · May 24, 2018


Using the right colors can make your visualizations spring to life. However, it can be quite hard to pick good colors, and, even if you know which colors you want to use, implementing the desired effect can be tricky.

Choosing a good color palette is a science in and of itself and some great resources about that topic are available (listed at the end of this post). This post focuses on what to do once you have chosen the palette you want.

The matplotlib package has excellent support for managing colors and plots. However, it can be tricky to work out how to combine the various classes and functions in this package. In this tutorial we will go over a few examples. As usual, all the code I used is in this git repository and you can follow along in the Jupyter notebook here.

For this tutorial, we will use two data sets related to international trade. One data set, provided by the Danish Maritime Authority, shows the location of ocean traffic around Denmark. A second data set, provided by the US Census Bureau, shows imports and exports over the last 20 years from/to the USA.
I’ve done some data pre-processing, and details can be found in the git repo for this tutorial.

Plots with default values

Since this tutorial is about colors, I will gloss over the details of data loading and formatting; the code for that can be found in the Jupyter notebook.

I’ve created a numpy array with ocean traffic density called values and a Pandas DataFrame with imports called line_data.

Let’s start with some basic plots. To make a density plot from a numpy array we can use the imshow function from matplotlib:

import numpy as np
import matplotlib.pyplot as plt

# define geographic extent of the image in latitude/longitude
bounds = [10.0, 11.0, 57.0, 58.0]
# plot log(values + 1) so the full range of densities is visible
im = plt.imshow(np.log(values + 1),
                origin='lower',
                extent=bounds)
plt.colorbar(im)

Notice that we plot the logarithm of the density. More on that below. Similarly, we can make a line plot of the US Census data.

for year in sorted(line_data.year.unique(), reverse=True):
    year_data = line_data.loc[line_data.year == year, :]
    plt.plot(year_data['month'],
             year_data['imports'],
             lw=2,
             label=year)

I’ve also added some axis labels and a legend, and the result will look roughly like this:

Plots with default color palette.

The default plots already look pretty good, but we can see a few problems. In the case of the density plot, I had to plot the logarithm of the density in order to see the full range of the data. As a result, the color bar also shows the logarithm of the density, which is not what we want. Furthermore, the vast majority of the plot ended up being roughly the same color.

In the case of the line plots, it’s a bit hard to identify which line corresponds to which year because colors are reused. Furthermore, the color palette of the line plot is different from the color palette of the density plot, which makes the plots look disconnected.
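
The reuse happens because matplotlib assigns line colors from a fixed property cycle, which holds ten colors in the default style. Here is a quick check, assuming the default style is active and a hypothetical 17 lines (one per year from 1994 to 2010):

```python
import matplotlib.pyplot as plt

# Colors that plt.plot cycles through when no color is given.
# Assumes the default style; a custom style sheet may define a different cycle.
cycle_colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

n_lines = 17  # hypothetical: one line per year, 1994 through 2010
# Colors are handed out round-robin, so line i gets cycle_colors[i % len(cycle_colors)]
assigned = [cycle_colors[i % len(cycle_colors)] for i in range(n_lines)]

n_distinct = len(set(assigned))
print(f"{n_lines} lines share {n_distinct} distinct colors")
```

Whenever the number of lines exceeds the length of the cycle, some colors repeat and the legend becomes ambiguous.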

Adjusting existing color maps

Let’s start with the density plot. The density of ocean traffic is represented in our data as a single value for each latitude/longitude coordinate. Converting that value to a color requires two steps:

  • We use a normalizer to convert the density value to a number in the range 0 to 1.
  • We use the color map to convert that normalized value to three numbers representing red, green, and blue (i.e., an RGB value).
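
As a minimal sketch of these two steps, here they are applied by hand to a single, made-up density value. It uses matplotlib's Normalize class and the 'inferno' color map, both of which come up again below; the density of 25 and the range 0 to 100 are assumptions for illustration only. Note that matplotlib returns a fourth number, alpha (opacity), alongside red, green, and blue.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

norm = Normalize(vmin=0, vmax=100)   # step 1: data value -> number in [0, 1]
cmap = plt.get_cmap('inferno')       # step 2: number in [0, 1] -> color

density = 25.0                       # hypothetical density value
normalized = norm(density)           # linear scaling: (25 - 0) / (100 - 0)
rgba = cmap(normalized)              # four floats in [0, 1]: red, green, blue, alpha
rgb = rgba[:3]                       # drop alpha to get the RGB triple

print(float(normalized), rgb)
```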

Matplotlib has several classes for that. To normalize our density values, matplotlib provides both linear and nonlinear normalizers. For example, we can use the Normalize class to apply a linear scaling, or we can use classes like PowerNorm or LogNorm to use nonlinear scaling.
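
To get a feel for the difference, the sketch below pushes the same three made-up values through each normalizer. The values and the range 1 to 100 are assumptions chosen for illustration; vmin is 1 rather than 0 because LogNorm needs strictly positive limits.

```python
import numpy as np
from matplotlib.colors import LogNorm, Normalize, PowerNorm

sample = np.array([1.0, 10.0, 100.0])  # hypothetical densities

linear = Normalize(vmin=1, vmax=100)(sample)             # (x - vmin) / (vmax - vmin)
power = PowerNorm(gamma=0.5, vmin=1, vmax=100)(sample)   # linear result raised to the power gamma
log = LogNorm(vmin=1, vmax=100)(sample)                  # linear in log(x)

for name, result in [('linear', linear), ('power', power), ('log', log)]:
    print(name, np.round(np.asarray(result), 3))
```

All three map the end points to 0 and 1, but they place the middle value very differently: the linear normalizer pushes it close to 0, while the nonlinear ones move it further up the color map.
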
We will talk more about color maps in the next section, so for now we will use an existing color map that we retrieve using the function get_cmap.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import PowerNorm

# matplotlib.cm.get_cmap was deprecated and later removed; plt.get_cmap still works
cmap = plt.get_cmap('inferno')
# we can adjust how our values are normalized
norm = PowerNorm(0.4, vmin=0, vmax=100)

fig, ax = plt.subplots()
# the colormap and normalizer can be passed directly to imshow
im = ax.imshow(values,
               cmap=cmap,
               norm=norm,
               origin='lower',
               extent=bounds)
# create a color bar with 10 ticks
plt.colorbar(im, ticks=np.arange(0, norm.vmax, norm.vmax/10))

Here’s what that looks like for different values for the power and vmax settings of the normalizer.

Exploring different scaling options.

Notice that the color bars now show the actual values. Let’s apply the same color map to our line plot. We’ll use a linear normalization, but this time we need to generate our colors manually:

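from matplotlib.colors import Normalize

line_norm = Normalize(1980, 2010)

fig, ax = plt.subplots()
for year in sorted(line_data.year.unique(), reverse=True):
    year_data = line_data.loc[line_data.year == year, :]

    # normalize the year to [0, 1], then look up its color
    norm_value = line_norm(year)
    col = cmap(norm_value)

    ax.plot(year_data['month'],
            year_data['imports'],
            lw=2,
            c=col,
            label=year)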
# set the background and grid colors
plt.gca().set_facecolor('#111111')
plt.grid(color='#555555')

Viewing the density plot and line plot side-by-side, we get this:

Density plot and lineplot with the same palette.

Ok, that’s a little better. At least the plots now look like they were both made by the same person. There is still a lot to quibble about in these plots, but at least we now know how to control that. Next, let’s make our own color map.
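
For readers who want to try this without the trade data, here is a minimal self-contained sketch of the same idea. The density array and the import curves are synthetic stand-ins generated on the spot, not the real data sets, and the names ax_density and ax_lines are my own choices for this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize, PowerNorm

rng = np.random.default_rng(0)
# Synthetic stand-ins for the article's data (NOT the real data sets)
density = rng.exponential(scale=5.0, size=(50, 50))  # skewed, like traffic density
years = np.arange(1994, 2011)
months = np.arange(1, 13)

# One color map, two normalizers: nonlinear for density, linear for years
cmap = plt.get_cmap('inferno')
density_norm = PowerNorm(0.4, vmin=0, vmax=100)
line_norm = Normalize(1980, 2010)

fig, (ax_density, ax_lines) = plt.subplots(1, 2, figsize=(12, 4))

# Density plot: imshow applies the normalizer and color map for us
im = ax_density.imshow(density, cmap=cmap, norm=density_norm, origin='lower')
fig.colorbar(im, ax=ax_density)

# Line plot: we look up one color per year ourselves
for year in years[::-1]:
    imports = (year - 1990) + np.sin(months / 2.0)  # made-up seasonal curve
    ax_lines.plot(months, imports, lw=2,
                  c=cmap(line_norm(year)), label=str(year))
ax_lines.set_facecolor('#111111')
ax_lines.grid(color='#555555')

# Call plt.show() or fig.savefig(...) here to view the result
```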

Custom Color Palettes

There are many reasons to choose a color palette. Clarity is an important one, and a lot has been said about how to use color to make plots more readable. matplotlib provides many predefined colormaps to help with this. Style is also important. Maybe you feel like ‘inferno’ does not quite send the right message for the data you are presenting or maybe you just want to conform to your company’s in-house style. To do that, we need to define our own color map.

It’s probably not a good idea to pick arbitrary colors for your palette. Luckily, there are several great tools to help you choose a palette, such as colorbrewer and paletton.

As mentioned before, the Colormap class converts a number in the range from 0 to 1 into an RGB triple. To build such a color map, matplotlib has a nice utility class called LinearSegmentedColormap. We can create an instance from a list of colors using its from_list method; matplotlib then takes care of interpolating those colors for any value in [0, 1].
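
To see the interpolation at work, here is a small sketch with a deliberately simple, hypothetical palette of just black and white. The name demo_cmap is an assumption for this example, and to_hex is a matplotlib helper that converts a color back to a hex string.

```python
from matplotlib.colors import LinearSegmentedColormap, to_hex

# Hypothetical two-color palette: pure black to pure white
demo_cmap = LinearSegmentedColormap.from_list('demo_cmap', ['#000000', '#ffffff'])

start = to_hex(demo_cmap(0.0))   # '#000000', the first color in the list
end = to_hex(demo_cmap(1.0))     # '#ffffff', the last color in the list
middle = demo_cmap(0.5)          # RGBA tuple for a gray roughly halfway between

print(start, end, middle)
```

The end points reproduce the listed colors exactly, and everything in between is blended. With more than two colors in the list, they are spaced evenly across the range from 0 to 1.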

from matplotlib.colors import LinearSegmentedColormap, Normalize, PowerNorm

# blue -> green -> yellow
colors = ['#225ea8', '#41b6c4', '#a1dab4', '#ffffcc']
# create color maps and normalization
density_cmap = LinearSegmentedColormap.from_list('my_cmap', colors)
density_norm = PowerNorm(0.5, vmin=0, vmax=100)
line_cmap = LinearSegmentedColormap.from_list('my_cmap', colors)
line_norm = Normalize(vmin=1994, vmax=2010)

Using these color maps and normalizers as before, we get the plot below.

Or, if you prefer something that works with a white background, we could try this.

# white -> blue -> green for the density plots
colors = ['#ffffff', '#0C4DFF', '#64FF00']
density_norm = PowerNorm(0.5, vmin=0, vmax=100)
density_cmap = LinearSegmentedColormap.from_list('my_cmap', colors)
# blue -> green for the line plots
line_norm = Normalize(vmin=1994, vmax=2010)
line_cmap = LinearSegmentedColormap.from_list('my_cmap', colors[1:])
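
If you want to reuse the resulting line colors elsewhere (in a style guide or another tool, say), one option is to convert them to hex strings. The sketch below does that for the blue-to-green line palette; the year range is taken from the normalizer above, and the dictionary name year_colors is my own.

```python
import numpy as np
from matplotlib.colors import LinearSegmentedColormap, Normalize, to_hex

colors = ['#ffffff', '#0C4DFF', '#64FF00']
line_cmap = LinearSegmentedColormap.from_list('my_cmap', colors[1:])
line_norm = Normalize(vmin=1994, vmax=2010)

years = np.arange(1994, 2011)
# Normalizers and color maps both accept arrays, so all years convert in one call
rgba = line_cmap(line_norm(years))   # one RGBA row per year
year_colors = {int(y): to_hex(tuple(c)) for y, c in zip(years, rgba)}

for year, hex_color in year_colors.items():
    print(year, hex_color)
```

The first and last years land exactly on the two end colors of the palette, and the years in between get evenly spaced blends.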

You can view the complete code of all these examples in this Jupyter notebook.

Further reading

As I mentioned before, choosing or designing a good color map is a science in itself. If you want to learn how the default color maps in matplotlib are designed, you can read up on perceptually uniform color maps. Also, here is a post with general advice on the use of color in data visualization. More links to data and documentation are in the Jupyter notebook.

Happy coloring!

