I had a short exchange with Andy Kirk when he tweeted something that reminded me of one of my pet peeves.
A basic choropleth map is one that breaks down geographic areas — like counties — and colors them in with a color that represents a value range for whatever data you’re mapping. They look more or less like this:
Sidebar: I should note that this post isn’t at all a knock at the GIS specialist who made these maps. She built and maintained a great mapping system for data she was given, and did her job very well. These are reflections on mapping in general.
Choropleth maps are often used to show data that have a geographic component (like the above map, of cancer rates by county in Minnesota). But does that geographic component add anything when it’s visualized? And is it worth the drawbacks?
Let’s take a look at some drawbacks of a map like this:
It lowers the resolution of the data: A choropleth map compresses data values into categories; the map of cancer rates above shows just 4 categories of rates. When we un-compress the data and show the actual values, we see something more.
Here’s the same data presented another way (with a simple default chart from Excel). We see that a lot of counties have very similar cancer rates, but there are a few clear outliers, both high and low. This information is not available in a choropleth map.
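To make that compression concrete, here is a minimal Python sketch. The county names and rates are made up for illustration (they are not the Minnesota data), and the equal-interval scheme is just one assumed way a mapping tool might pick its 4 categories.

```python
from bisect import bisect_right

# Hypothetical rates per 100,000, for illustration only (not the Minnesota data).
rates = {
    "County A": 412.0,
    "County B": 455.0,
    "County C": 458.0,
    "County D": 461.0,
    "County E": 463.0,
    "County F": 467.0,
    "County G": 540.0,
}

def equal_interval_breaks(values, n_classes):
    """Return the interior break points that split the range into equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes)]

def classify(value, breaks):
    """Return the 0-based class (color) index a value falls into."""
    return bisect_right(breaks, value)

breaks = equal_interval_breaks(list(rates.values()), 4)
classes = {county: classify(rate, breaks) for county, rate in rates.items()}

for county, rate in rates.items():
    print(f"{county}: rate={rate:.0f} -> class {classes[county]}")
```

In this toy data, Counties B through F all land in the same class and would get the same color, while the gap between them and the two outliers is reduced to "a different shade." A chart of the raw values keeps those distances visible.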
It doesn’t provide geographical information: The map of cancer rates doesn’t show a spatial pattern. High and low rates look more or less randomly scattered around the state. For that reason, showing the data on a map isn’t a strategic improvement over showing the data in a chart.
Obviously, it might take some care and thought to display a bar chart with 87 different values — this would challenge you to know what story you’re trying to tell and to reinforce that story through design.
It’s complicated: Mapping data likely requires some specialty skills — whereas charting is available to most people who can google and use Microsoft Office.
I’m not arguing that maps aren’t helpful or compelling ways to present data — but I think that there are some criteria that can help determine when they are, and when they’re not.
When there’s a geographic change or flow to the data: In contrast to the map of cancer rates at the top of this post, take a look at this map of radon values (snipped below).
This map does communicate spatial information beyond the location of the counties. It shows a clear geographic pattern from the southwest that aligns with the Des Moines lobe till: soil that was deposited by glacial movements and produces radon.
When you communicate many layers of data: I went looking for a map of asthma rates and highways, and while I couldn’t find what I was looking for, I did find this paper that includes the maps below. I can imagine an online interactive system that lets a user browse these layers in combination with the asthma data.
When you expect your users to look something up based on their geography: The format is the content, too. A system that allows or expects users to click their location — rather than type in their location or worse, select it from an interminable drop-down menu — is a good use of a map, both for initial selection and subsequent presentation.
Locating data in its place can help ground the values in a user’s mind. However, I’ve heard people get excited about the idea of mapping data without a critical analysis of whether the data need to be mapped. Mapping data, like any other way of communicating it, needs to be done thoughtfully to ensure a clear value to the end user, and to the process of communicating.