7 quick steps for effective plots with Matplotlib

Marc Eksteen
8 min read · Feb 3, 2023


From start to finish :)

Need to rapidly upgrade your plot? I’ve written this guide in 7 simple steps.

It’ll be especially helpful if you are using Python and Matplotlib.

We’ve all seen hideous plots before. Some are legitimately terrible. Off-beat colours, dodgy axes, way too much information.

If you’re reading this guide, I suspect your plot doesn’t fall into that category. It’s probably a step up from there — it’s probably pretty acceptable, maybe just a bit clunky and lacking polish. You might be giving a presentation or a report, and hoping to make a good impression.

Maybe your plot looks a bit like the one below. We have some randomly generated sales data for three different stores, and a linear trend line for each store. Let’s pretend we’ll be using this plot in a presentation.

import matplotlib.pyplot as plt
import random

random.seed(0)

#generate some data
x_data = range(2000,2021,1)
y_data_s1 = [((x-2000)*5) + 100 + random.uniform(-50,50) for x in x_data]
y_data_s2 = [((x-2000)*9) + 100 + random.uniform(-50,50) for x in x_data]
y_data_s3 = [((x-2000)*3) + 100 + random.uniform(-20,20) for x in x_data]
#trend lines
y_data_r1 = [((x-2000)*5) + 100 for x in x_data]
y_data_r2 = [((x-2000)*9) + 100 for x in x_data]
y_data_r3 = [((x-2000)*3) + 100 for x in x_data]

#set up the figure
fig = plt.figure()
fig.set_figwidth(10)
fig.set_figheight(5)
#do some plotting
plt.scatter(x_data,y_data_s1,c="red")
plt.scatter(x_data,y_data_s2,c="orange")
plt.scatter(x_data,y_data_s3,c="blue")
plt.plot(x_data,y_data_r1,label="Store A",c="red")
plt.plot(x_data,y_data_r2,label="Store B",c="orange")
plt.plot(x_data,y_data_r3,label="Store C",c="blue")
#formatting
plt.legend()
plt.xlabel("Years from 2000 to 2020")
plt.ylabel("Store Sales ($ per year)")
plt.grid()
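
The trend lines above are simply the known slopes used to generate the data. With real sales figures you would have to estimate them yourself. Here is a minimal sketch, assuming an ordinary least-squares fit is good enough for your data. `fit_line` is a hypothetical helper written for this example, not a Matplotlib function:

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

random.seed(0)
x_data = list(range(2000, 2021))
y_data_s1 = [((x - 2000) * 5) + 100 + random.uniform(-50, 50) for x in x_data]

# estimate the trend from the noisy points instead of hard-coding it
slope, intercept = fit_line(x_data, y_data_s1)
y_fit_s1 = [slope * x + intercept for x in x_data]  # pass this to plt.plot
```

The fitted line will differ a little from the hard-coded one, since it follows the noise in the sample.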

Let’s take this plot from a C+ to an A standard.

Step 1: remove unnecessary ‘ink’
Step 2: fix colour scheme
Step 3: boost line thickness
Step 4: raise font size
Step 5: revise labels
Step 6: review
Step 7: bonus

Step 1: remove unnecessary ‘ink’.

A supervisor taught me this trick a few years back. Identify and remove any ‘ink’ that fails to aid understanding. It will only distract.

Since we are just using this plot for a presentation, we can safely remove:

  • the plot box
  • the legend border
  • the grid

#remove the plot box:
ax = plt.gca()
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

#no legend border
plt.legend(frameon=False)

#also make sure to remove plt.grid()

Looking a bit better.

Depending on the circumstance, there are plenty of other common items we can safely omit. Only put something on the plot if it aids understanding.
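
As a rough sketch of what else could go, the helper below also drops the little tick marks while keeping their labels. `strip_ink` is a made-up name for this example, not part of Matplotlib, and the choices in it are assumptions about a presentation plot rather than rules:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

def strip_ink(ax):
    """Remove decoration that rarely aids understanding on a slide."""
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    ax.grid(False)
    ax.tick_params(length=0)  # keep tick labels, drop the tick marks
    legend = ax.get_legend()
    if legend is not None:
        legend.set_frame_on(False)

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot([2000, 2010, 2020], [100, 150, 200], label="Store A")
ax.legend()
strip_ink(ax)
```

Wrapping the clean-up in one function makes it easy to apply the same treatment to every plot in a deck.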

Step 2: pick a simple, distinguishable colour scheme.

Colour is essential for effective plots. A truly effective plot should ‘upload’ much of its information into the viewer’s brain before they’ve even really processed it. That’s why we removed the unnecessary ink, and why we now want to perfect the colour scheme.

Colour theory is a big topic. Generally, the colour palette you use depends on the plot. Here are three palettes that are common for charts:

  • Qualitative palette: bright, punchy colours which stand out from one another. Usually used for bar charts, line plots, etc.
  • Sequential palette: usually ranges from white or black to a bright primary colour (e.g. red). Used for maps, heatmaps, confusion matrices, tables and so on.
  • Diverging palette: a palette usually ranging from one primary colour to another. E.g. from blue through white to red.

An excellent website for picking chart-worthy colour palettes is ColorBrewer.
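
Several ColorBrewer palettes also ship with Matplotlib as named colormaps, so you may not need to copy hex codes by hand. Here is a small sketch, using the `Dark2`, `Blues` and `RdBu` colormaps as one example of each palette type. The sample positions are arbitrary:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

# qualitative: a fixed list of distinct colours
qualitative = [to_hex(c) for c in plt.get_cmap("Dark2").colors]

# sequential: sample from light to dark along one hue
blues = plt.get_cmap("Blues")
sequential = [to_hex(blues(v)) for v in (0.2, 0.5, 0.8)]

# diverging: two hues meeting at a pale midpoint
rdbu = plt.get_cmap("RdBu")
diverging = [to_hex(rdbu(v)) for v in (0.0, 0.5, 1.0)]

print(qualitative[:3])
```

The hex strings can be passed straight to the `c` argument of `plt.plot` or `plt.scatter`.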

If you’re looking to get a bit more creative, there are other colour palette tools including this color wheel by Adobe and this suite of resources by Canva.

For our dataset, a qualitative colour scheme seems most appropriate. Why? Because we want our viewers to easily differentiate between the different stores on our plot. So let’s pick some colours that fit with that scheme, and we’re away :)

This is based on one of the qualitative colour palettes from ColorBrewer (Dark2):
plt.scatter(x_data,y_data_s1,c="#7570b3")
plt.scatter(x_data,y_data_s2,c="#1b9e77")
plt.scatter(x_data,y_data_s3,c="#d95f02")
plt.plot(x_data,y_data_r1,c="#7570b3",label="Store A")
plt.plot(x_data,y_data_r2,c="#1b9e77",label="Store B")
plt.plot(x_data,y_data_r3,c="#d95f02",label="Store C")

By removing the super punchy colours we started with, the plot is looking a bit flat. So the next step will really emphasise the parts of the plot that matter.

Important side note 1: if you have multiple plots, you should carefully consider the colours you use across the plots. You could keep the palette consistent, or make it distinct for each plot. Put yourself in your viewers’ shoes.

Important side note 2: remember that not everyone’s eyes work the same. While green and red may be easy for you to differentiate, colour-blind people may struggle. Blue and yellow is also sometimes an issue. Consider using colour-blind friendly colour schemes. The ColorBrewer website can help you with that.
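
One hedge against colour confusion is to avoid relying on colour alone. The sketch below pairs each store's colour with its own marker shape and line style. The specific markers and dash patterns are arbitrary choices for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

x_data = list(range(2000, 2021))
stores = {
    # label: (slope, colour, marker, line style)
    "Store A": (5, "#7570b3", "o", "-"),
    "Store B": (9, "#1b9e77", "s", "--"),
    "Store C": (3, "#d95f02", "^", ":"),
}

fig, ax = plt.subplots(figsize=(10, 5))
for label, (slope, colour, marker, style) in stores.items():
    trend = [(x - 2000) * slope + 100 for x in x_data]
    # markevery thins out the markers so the line stays readable
    ax.plot(x_data, trend, c=colour, marker=marker, linestyle=style,
            markevery=4, linewidth=4, label=label)
ax.legend(frameon=False)
```

The legend picks up the marker and dash pattern too, so the stores stay distinguishable even in a greyscale printout.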

Step 3: increase line thickness or bar size

Again, assume your viewers are sitting a bit too far away from your presentation, or that the plot has printed out a bit too small on your report. Make any important lines big and bold and easy to see.

plt.scatter(x_data,y_data_s1,c="#7570b3")
plt.scatter(x_data,y_data_s2,c="#1b9e77")
plt.scatter(x_data,y_data_s3,c="#d95f02")
plt.plot(x_data,y_data_r1,c="#7570b3",label="Store A", linewidth=4)
plt.plot(x_data,y_data_r2,c="#1b9e77",label="Store B", linewidth=4)
plt.plot(x_data,y_data_r3,c="#d95f02",label="Store C", linewidth=4)
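
The same idea applies to scatter points and bars. Here is a short sketch with made-up numbers: `s` sets the scatter marker area in points squared, and `width` sets the bar width in x-axis units.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, (ax_points, ax_bars) = plt.subplots(1, 2, figsize=(10, 5))

# bigger scatter markers (the default area is rcParams['lines.markersize'] ** 2)
ax_points.scatter([2000, 2010, 2020], [100, 150, 200], c="#7570b3", s=80)

# wider bars leave less empty space between categories (the default width is 0.8)
bars = ax_bars.bar(["Store A", "Store B", "Store C"], [200, 280, 160],
                   color=["#7570b3", "#1b9e77", "#d95f02"], width=0.9)
```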

Step 4: boost the font size

Without seeing your plot, I can probably say your font size is too small. A supervisor taught me this a while back, and I now see the truth in it.

Look at any plot in a highly regarded scientific journal, and you’ll notice the font size is massive with respect to the size of the plot.

Again, we are striving for instant readability, without the viewer having to think hard. So let’s do it: first increase the font size, and then adjust the padding to ensure the labels stay away from the axes:

#some font sizes
size1 = 16
size2 = 20

#set default font size
plt.rc('font', size = size1)

#smaller font
plt.rc('xtick', labelsize = size1) #x-tick label
plt.rc('ytick', labelsize = size1) #y-tick label
plt.rc('legend', fontsize = size1) #legend

#larger font
plt.rc('axes', labelsize = size2) #x and y axis labels

#padding
plt.xlabel("Years from 2000 to 2020", labelpad=10)
plt.ylabel("Store Sales ($ per year)", labelpad=10)

#plot margin
plt.subplots_adjust(bottom=0.15)

There’s no exact formula, but as a general rule, bigger is better. You’ll lose far more by having tiny font than massive font.
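
Note that `plt.rc` changes the defaults for everything drawn afterwards in the same session. If you only want big fonts on one figure, `plt.rc_context` can scope the settings. Here is a minimal sketch, reusing the sizes above:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

size1 = 16
size2 = 20
presentation_fonts = {
    "font.size": size1,
    "xtick.labelsize": size1,
    "ytick.labelsize": size1,
    "legend.fontsize": size1,
    "axes.labelsize": size2,
}

default_size = plt.rcParams["font.size"]

# the settings apply only to artists created inside the with-block
with plt.rc_context(presentation_fonts):
    fig, ax = plt.subplots(figsize=(10, 5))
    ax.set_xlabel("Years", labelpad=10)
    ax.set_ylabel("Store Sales ($ per year)", labelpad=10)

# outside the block, plt.rcParams["font.size"] is back to default_size
```

This keeps the large fonts from leaking into other plots in the same notebook or script.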

You’ll notice though that the x-axis is very crowded. Let’s address that next.

Step 5: revise axis and tick labels

Make sure also that the x- and y-tick labels are neat and tidy and fit for purpose. If your plot needs very fine detail, consider using more tick labels. Otherwise, consider using fewer.

Also, as a general rule, your axis titles should be short and to the point. Make sure that your axis titles clearly identify the content of the plot, and the units of measurement (if needed).

#axis ticks
ax.set_xticks([4*x+2000 for x in range(6)])
ax.set_yticks([40*y+80 for y in range(6)])

#simplify x label
plt.xlabel("Years",labelpad=10)
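
If you would rather not compute the tick positions by hand, Matplotlib's `ticker` module has locators that can do it for you. Here is a sketch, assuming a tick every 4 years and every 40 units on the y-axis suits the data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

# the hand-written lists from above, spelled out
x_ticks = [4 * x + 2000 for x in range(6)]   # 2000, 2004, ..., 2020
y_ticks = [40 * y + 80 for y in range(6)]    # 80, 120, ..., 280

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot([2000, 2020], [100, 280])

# place a tick at every multiple of the given step instead
ax.xaxis.set_major_locator(MultipleLocator(4))
ax.yaxis.set_major_locator(MultipleLocator(40))
```

Unlike a fixed list, a locator keeps producing sensible ticks if the axis limits change later.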

Step 6: give your plot to a coworker or friend, and get them to explain it to you.

This step can be incredibly valuable. A plot that makes perfect sense in your brain might be deeply confusing to someone else.

A quick feedback session with a friend or coworker can identify any points of confusion. This will help you improve the plot itself, and also help you refine your talking points for your report or presentation.

Remember this: a plot is rarely intended to be a dump of random information. Plots tell stories.

Step 7: bonus points

Remember that effectively every element of a Matplotlib plot is customizable. With enough effort, you can customize the plot to look however you like.

Presenting a slideshow on a dark background? Consider changing the plot background to match. You can achieve very slick presentations if you match your plot to the presentation style. For example:

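#set background
fig.patch.set_facecolor('#302C30')
ax.set_facecolor('#302C30')

#set font and axes to white
#(rcParams are generally only read when artists are created,
#so here we style the existing axes directly)
ax.xaxis.label.set_color("white")
ax.yaxis.label.set_color("white")
ax.tick_params(colors="white")
for spine in ax.spines.values():
    spine.set_edgecolor("white")

#update legend
leg = ax.get_legend()
for t in leg.get_texts():
    t.set_color("white")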
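
If an exact colour match with your slides isn't essential, Matplotlib's built-in `dark_background` style sheet is a quicker route. Here is a sketch, assuming a plain black background is acceptable:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

x_data = list(range(2000, 2021))
trend = [(x - 2000) * 5 + 100 for x in x_data]

# the style applies to figures created inside the with-block
with plt.style.context("dark_background"):
    fig, ax = plt.subplots(figsize=(10, 5))
    ax.plot(x_data, trend, c="#7570b3", linewidth=4, label="Store A")
    ax.set_xlabel("Years", labelpad=10)
    ax.set_ylabel("Store Sales ($ per year)", labelpad=10)
    ax.legend(frameon=False)
```

The style sheet flips the background, text and axes colours in one go, while setting the colours by hand gives you an exact match with a custom slide colour.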

And on the slide show itself:

All done :)

From start to finish

Producing good plots is an art form, and definitely open to interpretation. I’m still learning, and these are just some tips I have accumulated over the years. Hope they helped :)

-Marc

Final code:

import matplotlib.pyplot as plt
import random

random.seed(0)

#generate some data
x_data = range(2000,2021,1)
y_data_s1 = [((x-2000)*5) + 100 + random.uniform(-50,50) for x in x_data]
y_data_s2 = [((x-2000)*9) + 100 + random.uniform(-50,50) for x in x_data]
y_data_s3 = [((x-2000)*3) + 100 + random.uniform(-20,20) for x in x_data]
#trend lines
y_data_r1 = [((x-2000)*5) + 100 for x in x_data]
y_data_r2 = [((x-2000)*9) + 100 for x in x_data]
y_data_r3 = [((x-2000)*3) + 100 for x in x_data]

#some font sizes
size1 = 16
size2 = 20

#set default font size
plt.rc('font', size = size1)

#smaller font
plt.rc('xtick', labelsize = size1) #x-tick label
plt.rc('ytick', labelsize = size1) #y-tick label
plt.rc('legend', fontsize = size1) #legend

#larger font
plt.rc('axes', labelsize = size2) #x and y axis labels

#set-up the figure
fig = plt.figure()
fig.set_figwidth(10)
fig.set_figheight(5)

#plot the data
plt.scatter(x_data,y_data_s1,c="#7570b3")
plt.scatter(x_data,y_data_s2,c="#1b9e77")
plt.scatter(x_data,y_data_s3,c="#d95f02")
plt.plot(x_data,y_data_r1,c="#7570b3",label="Store A", linewidth=4)
plt.plot(x_data,y_data_r2,c="#1b9e77",label="Store B", linewidth=4)
plt.plot(x_data,y_data_r3,c="#d95f02",label="Store C", linewidth=4)

#labelling
leg = plt.legend(frameon=False)
plt.xlabel("Years",labelpad=10)
plt.ylabel("Store Sales ($ per year)",labelpad=10)

#remove plot box:
ax = plt.gca()
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

#adjust tick labels
ax.set_xticks([4*x+2000 for x in range(6)])
ax.set_yticks([40*y+80 for y in range(6)])

#adjust plot margins
plt.subplots_adjust(bottom=0.15)

#OPTIONAL - custom background colour - uncomment this section
# #set background
# fig.patch.set_facecolor('#302C30')
# ax.set_facecolor('#302C30')

# #set fonts and axes to white
# #(rcParams are generally only read when artists are created,
# #so here we style the existing axes directly)
# ax.xaxis.label.set_color("white")
# ax.yaxis.label.set_color("white")
# ax.tick_params(colors="white")
# for spine in ax.spines.values():
#     spine.set_edgecolor("white")

# #update legend
# for t in ax.get_legend().get_texts():
#     t.set_color("white")

