The past and future of data visualization
This weekend a friend of mine stumbled across old gold in the form of Willard Cope Brinton’s 1939 book Graphic Presentation. It’s available to read online (from the always awesome Internet Archive). I highly recommend you spend 10 minutes perusing the book’s many example graphics.
There must be 1,000 varieties of charts and graphics catalogued in the 500-page book, and despite being 75 years old, it still feels incredibly relevant today. I was particularly struck by the introduction, titled Magic in Graphs:
There is a magic in graphs. The profile of a curve reveals in a flash a whole situation — the life history of an epidemic, a panic, or an era of prosperity. The curve informs the mind, awakens the imagination, convinces.
Graphs carry the message home. A universal language, graphs convey information directly to the mind. Without complexity there is imaged to the eye a magnitude to be remembered. Words have wings, but graphs interpret. Graphs are pure quantity, stripped of verbal sham, reduced to dimension, vivid, unescapable.
- Henry Hubbard
If you look closely, you’ll even find examples of arguments the blogosphere is still fond of having today (can line charts have non-zero Y axes?), and early versions of charts that you thought were invented just yesterday.
Hubbard closes the introduction on a forward-looking note (emphasis mine):
Wherever there are data to record, inferences to draw, or facts to tell, graphs furnish the unrivaled means whose power we are just beginning to realize and to apply.
Clearly, Hubbard was excited about the future of graphic presentation. Surely it would bring multitudes of innovations as we established a grammar of graphics and figured out how we should best present data for consumption. While there were 1,000 chart varieties in the 1939 edition, who knows how many there would be by 2000?
Unfortunately, my friend Ed has bad news for Hubbard. After reading through Graphic Presentation he exclaimed, “I’m really shocked at how little innovation there has been in 2D data visualization in the last 80 years!”
Always the optimist, I’m here to take up the challenge. What’s new since Brinton’s book? In the world of static, 2D data visualizations, I think Ed is mostly right, with a few exceptions. That doesn’t mean we haven’t innovated, though: there has been tremendous progress in and around the graphic forms that were already established in 1939.
To better understand the progress in the years since Brinton’s book was published, first we need to understand how we got there. When were all these graphics invented anyway? Is it fair to expect a boom in new chart types every decade?
Data visualization’s past
It is easy to think of the basic charts and graphs of today as universally understood forms that have existed forever.
But many chart types that we use every day (whether as visualization professionals or as students using the Excel chart wizard for the first time) are relatively recent inventions.
Working off of this nice taxonomy of chart types, I tracked down the earliest known dates for as many visualizations as I could. As you’ll see from the timeline below, recognizable modern charts began to be used in the second half of the 18th century (William Playfair is famously credited with the invention of the line, bar and pie charts), and the rate of innovation was rapid through the next 100 years.
The events of 1812 to 1855 in particular are the subject of three of the most famous and influential visualizations of all time: Minard’s March on Moscow (a Sankey-style flow diagram that predates Sankey’s own), Snow’s dot distribution map of the cholera epidemic in London, and Nightingale’s polar area diagram of the causes of death in the Crimean War.
After this furious pace, the rate of innovation as measured by brand new visualizations slowed substantially. There are notable exceptions (histograms, cartograms, heatmaps, treemaps, etc.), and I am purposefully sticking to general-purpose graphics. There have been quite a few domain-specific visualizations that have proven very popular, such as these information-dense circular diagrams of genomic data:
That said, rather than focus on the slowdown in the introduction of broadly used, standard chart types, let’s look at the latest era of data visualization with a wider lens. We’ve made much more progress than a count of new tabs on the Excel chart wizard would imply.
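If you want to check the pace of invention for yourself, the earliest dates in the source list at the end of this post can be tallied by half-century. The sketch below is a rough illustration only: the years are copied from that list (approximate entries are marked), the candlestick chart is left out because the list gives only "18th C", and the names `EARLIEST` and `tally_by_period` are made up for this example.

```python
from collections import Counter

# Earliest dates as given in this post's source list.
# Approximate entries are marked; the candlestick chart is omitted
# because the list gives only "18th C" rather than a year.
EARLIEST = {
    "Timeline chart (Priestley)": 1765,
    "Line chart": 1786,
    "Bar chart": 1786,
    "Pie chart": 1801,
    "March on Moscow (flow map)": 1812,
    "Choropleth": 1826,
    "Scatterplot": 1830,  # listed as ~1830s
    "Dot distribution map": 1854,
    "Polar area diagram": 1855,
    "Histogram": 1895,
    "Gantt chart": 1910,
    "Cartogram": 1911,
    "Heat map": 1957,
    "Treemap": 1994,
    "Sparklines": 1996,  # listed as ~1996
}


def tally_by_period(dates, width=50):
    """Count entries per period of `width` years, keyed by the period's start year."""
    counts = Counter((year // width) * width for year in dates.values())
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    width = 50
    for start, n in tally_by_period(EARLIEST, width).items():
        # One '#' per chart type first recorded in that period
        print(f"{start}-{start + width - 1}: {'#' * n} ({n})")
```

The list is short and hand-picked, so treat the tally as a conversation starter rather than a measurement.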
The last 75 years
The history of the last 75 years of data visualization runs in close parallel to that of many industries and fields, and it is unsurprisingly dominated by computers. While in the timeline above there are a few new chart types that appear for the first time only recently, the key innovations have not been in what the finished product of a data analysis looks like, but rather in how it is created and what readers can do with the chart when they see it.
Computers have made data astronomically easier to gather, process, analyze and manipulate, leading to expanded data visualization capabilities. This has enabled us to do four key things with data that Brinton could have only dreamed of:
1. Visualize data in real-time. Most commonly seen in today’s omnipresent dashboard displays, real-time analysis and presentation of data is a huge step forward from the kind of slow and painstaking work required to generate a single good chart before computers. Compare the ease with which you can spin up a real-time dashboard today with Brinton’s diligent approach — graphics were so time-intensive to create that he included entire chapters on the selection of quality paper and binding techniques.
2. Show a lot of data. This chart made waves for visualizing 10 million Facebook friendships. The translucent connections between friends overlap and brighten in areas with high Facebook friendship density, producing a beautiful pseudo-population map as the output. Good luck drawing that by hand.
3. Present data in motion. While related to the raw ability to visualize a lot of data, time series data in particular benefits from the fact that it can be played back as an animation. For example, this video visualizes 24 hours of flights in North America. Motion takes this visualization to another level — it is much more compelling than simply showing a count of how many flights took off and landed at each location.
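The "overlap and brighten" effect described in item 2 can be sketched in a few lines. This is not how the Facebook chart was actually produced (I don't know its pipeline); it is a minimal illustration of translucent strokes accumulating on a pixel grid, and the function name `blend_segments` and its parameters are invented for the example.

```python
def blend_segments(segments, width, height, alpha=0.2, steps=100):
    """Draw translucent white line segments onto a black grid.

    Returns a height x width grid of brightness values between 0.0 and 1.0.
    Each segment is ((x0, y0), (x1, y1)) in grid coordinates.
    """
    grid = [[0.0] * width for _ in range(height)]
    for (x0, y0), (x1, y1) in segments:
        # Sample points along the segment and collect the cells it touches,
        # so that each segment brightens a given cell at most once.
        touched = set()
        for i in range(steps + 1):
            t = i / steps
            col = round(x0 + t * (x1 - x0))
            row = round(y0 + t * (y1 - y0))
            touched.add((row, col))
        for row, col in touched:
            # Composite a white stroke of opacity `alpha` over the current
            # brightness: every overlap closes part of the gap to 1.0.
            grid[row][col] += alpha * (1.0 - grid[row][col])
    return grid


if __name__ == "__main__":
    # Two diagonals crossing in the middle of a 5 x 5 grid
    lines = [((0, 0), (4, 4)), ((0, 4), (4, 0))]
    for row in blend_segments(lines, width=5, height=5):
        print(" ".join(f"{value:.2f}" for value in row))
```

Cells crossed by a single line stay dim, while cells where lines pile up approach full brightness, which is why dense regions glow when millions of connections are drawn this way.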
Data visualization’s future
Finally, I see the fourth recent innovation around data visualization as the key area enabling graphic presentation to continue to evolve into the future.
4. Allow your audience to interact with the data. There have been fantastic developments in recent years in allowing readers and consumers of data visualizations to interact with data in meaningful ways. While a single representation of data brings the distinct perspective of its creator, interactive data apps allow the reader to not only see that story, but to discover more stories waiting for them in the data set. An excellent example of this is Hans Rosling’s Gapminder. You’ve probably seen his TED talks, but you may not know that you can go and explore loads of interesting data sets using his interactive tools. Give it a try and see what story you can find.
If Hans’s data doesn’t interest you, there are plenty of other data apps and interactive visualizations waiting out there for you. One fun example: say you are a fan of FiveThirtyEight’s article series looking at an actor’s career and classifying their movies by box office gross and Rotten Tomatoes rating.
Using the kind of exploratory tools that have recently become available, you can easily recreate that analysis for your favorite actor or look at grouping on different variables. Shiny, one of the leading tools for this kind of interactivity, includes just such a graphic on its website:
There was magic in graphs in 1939. Now, thanks to the power of computers, we have the ability to chart more data, faster, and we can engage the audience with animated or interactive visualizations — there is more magic in graphs than ever before.
Did I miss a key chart invention that I should have included? A key innovation from the last 75 years? Let me know at @uptownnickbrown on Twitter.
Find more of my writing on Medium or at http://uptownnickbrown.com/.
Want to read up on the invention of your favorite chart?
- Gantt, Priestley, 1765 http://dataremixed.com/2014/02/visualizing-history/ and Gantt, 1910 https://en.wikipedia.org/wiki/Gantt_chart
- Candlestick chart, Homma, 18th C https://en.wikipedia.org/wiki/Candlestick_chart
- Line chart, Playfair, 1786 https://en.wikipedia.org/wiki/William_Playfair
- Bar chart, Playfair, 1786 https://en.wikipedia.org/wiki/William_Playfair
- Pie chart, Playfair, 1801 https://en.wikipedia.org/wiki/William_Playfair
- Moscow (Sankey) / Cholera (Dot distribution) / Nightingale (Polar area), 1812, 1854, 1855 http://www.tableau.com/top-5-most-influential-data-visualizations
- Choropleth, Dupin, 1826 (named in 1938) https://en.wikipedia.org/wiki/Choropleth_map
- Scatterplot, ~1830s http://www.researchgate.net/publication/7923211_The_Early_Origins_and_Development_of_the_Scatterplot
- Histogram, Pearson, 1895 https://en.wikipedia.org/wiki/Histogram
- Cartograms, 1911 example http://makingmaps.net/2008/02/19/1911-cartogram-apportionment-map/
- Heat maps, 1957 https://en.wikipedia.org/wiki/Heat_map
- Treemap, 1994 https://en.wikipedia.org/wiki/Treemapping
- Sparklines, ~1996 http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0003Y1