What’s visual ‘encoding’ in data viz, and why is it important?

Sophie Warnes
3 min read · Jun 24, 2018

I first came across ‘visual encoding’ a few months ago (even though I’ve been doing data viz for a few years). I couldn’t really get to grips with what it was until I changed how I thought about it and realised it was something I’d known all along! It sounds much more technical and complex than it actually is, and I think ‘encoding’ is a really off-putting term.

Encoding in data viz basically means translating the data into a visual element on a chart/map/whatever you’re making. You need to do it right, because doing it right will mean that other people looking at your visualisations can understand what you’re trying to say or show.

Another way to think of encoding is as a set of rules to follow. When you’re making a visualisation, you need to think logically about how you set up those rules, otherwise people will get totally confused. If you’re doing something complicated, a good way of helping yourself to think about how you encode, or set up the rules, is:

Every time <data changes in some way>, do <something visual>

This helps you to be consistent in how you apply the rules. Consider this very boring rubbish chart:

It’s a standard column chart, and most people wouldn’t have trouble understanding what it’s showing us — but when you break it down, this is encoded in a few different ways:

  1. Colour: Every time <category is bears/dolphins/whales>, change <colour of the column to be blue/orange/grey>
  2. Size: Every time <number goes up>, increase <column height>
  3. Grouping: Every time <month changes>, create <new cluster of columns>
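
To make those three rules concrete, here is a minimal Python sketch that applies them to some made-up numbers. The months, the counts, the `PIXELS_PER_UNIT` scale and the `encode` function are all assumptions for illustration. Only the categories and their colours come from the chart described above.

```python
# Hypothetical data: the real chart's numbers are not reproduced here.
records = [
    {"month": "Jan", "category": "bears", "count": 8},
    {"month": "Jan", "category": "dolphins", "count": 15},
    {"month": "Jan", "category": "whales", "count": 3},
    {"month": "Feb", "category": "bears", "count": 5},
    {"month": "Feb", "category": "dolphins", "count": 12},
    {"month": "Feb", "category": "whales", "count": 9},
]

# Rule 1 (colour): every time the category changes, change the column colour.
COLOURS = {"bears": "blue", "dolphins": "orange", "whales": "grey"}

# Assumed scale: 10 pixels of column height per unit counted.
PIXELS_PER_UNIT = 10


def encode(record, month_order):
    """Translate one data record into the visual properties of one column."""
    return {
        "colour": COLOURS[record["category"]],          # rule 1: colour
        "height": record["count"] * PIXELS_PER_UNIT,    # rule 2: size
        "cluster": month_order.index(record["month"]),  # rule 3: grouping
    }


months = ["Jan", "Feb"]
columns = [encode(record, months) for record in records]
for column in columns:
    print(column)
```

Each rule is one line in `encode`, so the same data change always produces the same visual change, which is the whole point of writing the rules down.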

As I said, these are basic things that everyone recognises, but I find it helpful to break it down. “Of course the column height increases when the number gets bigger!” you say, but actually some people try to get away with not doing that, because they want to downplay or alter your perception of the data for whatever reason.
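
One way the size rule gets bent, offered here as an assumed example, is a column chart whose value axis doesn't start at zero. The sketch below uses hypothetical values and a hypothetical `drawn_height` helper to compare the ratio of the drawn heights with and without that kind of truncated baseline.

```python
def drawn_height(value, baseline=0.0, pixels_per_unit=10.0):
    """Column height in pixels when the value axis starts at `baseline`."""
    return (value - baseline) * pixels_per_unit


# Hypothetical values: b is 10% bigger than a.
a, b = 50, 55

# Axis starts at zero: the height ratio matches the data ratio.
honest = drawn_height(b) / drawn_height(a)

# Axis starts at 45: the same two values are drawn 50px and 100px tall.
truncated = drawn_height(b, baseline=45) / drawn_height(a, baseline=45)

print(honest, truncated)
```

With the baseline at zero the second column is drawn 10% taller, which is what the data says. With the baseline at 45 it is drawn twice as tall, so the rule "every time the number goes up, increase the column height" is technically followed but no longer in proportion.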

Let’s try something else. This is a screenshot from an amazing interactive about Hamilton! the musical:

This is more complex and looks very different. Here’s how it’s encoded:

  1. Colour: Every time <a different person speaks>, change <the colour of the circle>
    (So in this instance, Aaron Burr is purple, Alexander Hamilton is green, etc.)
  2. Area: The <longer the chunk of lyrics>, the <bigger the area of the circle>
    (I don’t quite know if it’s % of the song or exact number of lines, but in the second row on the left, the big red circle is King George, showing he basically sings the whole thing while the company sings Da da dat dat da ya da! behind him)
  3. Grouping: Group lines by song
    (Each cluster of circles is a different song)
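
The area rule has one catch worth spelling out: if the area of a circle is meant to grow with the data, the radius has to grow with the square root of it. Here is a minimal sketch of that, assuming the value is a number of lines. I don't know how the interactive itself computes its circles, and `radius_for` and `area_per_unit` are made-up names.

```python
import math


def radius_for(value, area_per_unit=20.0):
    """Radius of a circle whose area is proportional to `value`."""
    area = value * area_per_unit
    return math.sqrt(area / math.pi)


# Hypothetical chunks of lyrics: one has four times as many lines as the other.
small = radius_for(10)
large = radius_for(40)

# Four times the lines gives four times the area, but only about double the radius.
print(round(large / small, 2))
```

If you scaled the radius directly with the value instead, a chunk four times as long would get sixteen times the area, and the big circles would look far more important than they are.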

Here’s a non-exhaustive list of ways you can encode data:

  • Size
  • Shape
  • Colour
  • Grouping
  • Area
  • Position
  • Saturation
  • Line pattern
  • Line weight
  • Angle
  • Connections

Hopefully this makes sense and is helpful. I have written it really quickly and perhaps chosen peculiar examples, but thinking about encoding in this way has really helped me to find clarity when I try to create something using data. Thinking about it logically, formulaically, and in a structured way means I’m more consistent (because I have RULES!) and I can easily explain what I’m trying to do to someone else, or to a reader.

Sophie Warnes

Data nerd and journalist. Has probably worked at your fave UK paper. Unrepentant feminist. Likes: Asking irritating questions. Hates: Writing bios, pandas.