Dispatches from the data visualization gutter
All fields need a diverse fauna, and data visualization is no exception. We need innovation-driven practitioners to keep pushing the limits, we need to do useful client work and fundamental academic work, and we need to make sure the data visualization tide indeed lifts all boats.
This post follows Elijah Meeks’ If Data Visualization is So Hot, Why Are People Leaving? and Moritz Stefaner’s There be dragons: dataviz in the industry on the role of data visualization in organizations, and comes after listening to Data Stories episode 95, with Elijah. It has more humble roots and ambitions, since it’s the point of view of a dumb, 3D pie lover, not-entirely-human (datavis-wise), Excel user (please turn your sarcasm/irony detector on). As you know, from this point of view nothing relevant has happened in the field since Microsoft changed the chart engine back in Excel 2007.
The world of Excel data visualization
For the sake of argument, please accept for the moment that “Excel data visualization” is not an oxymoron. Here is how this space looks to me.
Mocking Excel users for fun & profit
The image on the left comes from an Office support page, and it somewhat legitimizes mocking Excel users, or not taking them seriously when it comes to data visualization. This is an easy, pleasurable and inconsequential activity we often indulge in (very much like discussing the role of pie charts in data visualization). After all, Excel charts are ugly as hell, chart functionality and effectiveness are surely below zero, data structure and EDA are foreign concepts to its users, and in fact their lack of curiosity and their general bovine attitude prevent them from going beyond even basic Excel and the Excel chart library. Subject matter tends to be the only dimension they are comfortable with.
It goes without saying that any self-respecting graphic designer would consider getting her adrenaline rush from watching paint dry, if the only other option was the instant narcoleptic effect of any Excel chart. At most, she would recognize Excel was used at a very early stage of data manipulation if, and only if, an Excel user was present (for a kumbaya effect).
Nobody goes there anymore. It’s too crowded.
Excel is the least sexy tool in data visualization. A punching bag. A scapegoat. Here are a few reasons why:
- Excel is a commodity. Since everyone knows Excel, any Excel-based activity, project or training course must be 10x cheaper when compared to using any other tool.
- Negative allelopathic effect: Because Excel is used as a generic tool, many of its capabilities remain unexplored or wrongly used, and since you don’t push the limits, you don’t feel the need to switch to (better) specialized tools. Hence its allelopathic effect (“a biological phenomenon by which an organism produces one or more biochemicals that influence the germination, growth, survival, and reproduction of other organisms”): Excel becomes the only tool for all things numerical, even if it isn’t the right one.
- One-size-fits-all skills. Excel is a low-cost solution: its ubiquity makes sharing its proprietary format a non-issue, most people are familiar with it, and training costs can be kept to a minimum. I suspect most people making charts in Excel couldn’t care less about making charts. Quarterly reviews force them to. Templates save a lot of work and they look awesome, don’t they? Most Excel training assumes that specific skills like data visualization will magically flow through formulas and formatting options, and that there is no difference between “making charts in Excel” and “making charts”.
- Competitors, whose discourse can be summarized in a paragraph: Excel is dying and this new tool will do everything you wanted to do in Excel but were afraid to. Yes, the news of its death was greatly exaggerated in the past, but this time is different. Because, you know, big data.
- Microsoft itself never positioned Excel as a viable data visualization tool, quite the opposite: it undermined it with stupid defaults and “chart types” like cones and bloated 3D options. Excel charts could be much better if Microsoft didn’t ask the marketing department to design them and target them at an audience with very low graphic literacy. And, because of Eternal September (constant influx of newcomers), there will always be a fresh supply of new users ready to accept Microsoft’s vision of what works best for its own sales.
- No Excel data visualization community. I can’t remember seeing an Excel version of something like #MakeoverMonday. With a few exceptions, the focus is on tips and tricks, where contextual knowledge and best practices are never discussed. There is a new (!) community, but it’s kinda depressing so far (still focused on Excel tips & tricks).
- Your charts are prettier than mine. Because Excel users often lack basic data visualization skills, evaluating a chart is based on/reduced to one’s sense of aesthetics, where everyone is entitled to an opinion.
The elephant and the butterfly
Most Excel users work for large organizations, which tend to be risk-averse and try not to change things if they don’t look broken. If moving slowly is a conscious strategy (consistency, avoiding technological fads, costs) and not a result of inertia, it’s okay to move like an elephant, not a butterfly.
An organization can be both an elephant and a butterfly, especially if it is in the market for an And-Now-for-Something-Completely-Different thing. For a good data visualization butterfly example, check OECD’s Better Life Index. Unfortunately, many other examples look like freaked out elephants jumping gracefully on water lilies. This happens when individual or organizational graphicacy (data visualization literacy) is low, making them easy targets for vendors/graphic designers of “memorable”, “appealing”, “professional” and “fun” data visualization “solutions”.
One of the first blog posts on my blog was a Letter to the Director-General of Eurostat, where I listed several examples of charts that had no place in a Eurostat publication. Now, pay attention: this was ten [insert expletive here] years ago. In my presentation at NTTS2017 (a Eurostat conference) I sent the same message. Again. The only substantial difference was that most of my examples were not from Eurostat: I had no intention to single out Eurostat when most statistical institutes share the same misconceptions.
Being a “statistical institute” is not the issue in and of itself: we can understand its elephant-ness. The problem lies elsewhere: not having someone with a data visualization background who actually cares and can talk some sense into their heads, improving the elephant and making sure the butterfly doesn't suck. Yes, it can be done, even with Excel.
There is a bright side, though.
Excel is very flexible and it lets you go far beyond the chart gallery. Jon Peltier made me aware of this many years ago. Ann K. Emery, Stephanie Evergreen, Jon Schwabish and Cole Nussbaumer Knaflic show in their books and blogs that you don’t have to remain behind bars (pun intended). Also, you are free to build bridges to other data visualization users without leaving Excel.
Just a quick example. Salvaging doomed charts is especially amusing to me. Here is my attempt to design a respectable speedometer, by adding multiple pointers and encoding a time series to each of them:
It can be hard to make a specific chart in Excel, but the real question is: can it be done cost-effectively? Do you really want to spend hours or days designing small multiples in Excel when other tools make them available out-of-the-box?
Because of its low cost, Excel is also a nice prototyping tool: you can create a fully functional chart or dashboard and use it to get user feedback or evaluate other tools.
Minimum data visualization literacy…
It’s hard for you to respect a thing whose mere existence you don’t recognize or you are not aware of. We do need a minimum data visualization literacy (Elijah Meeks’ words in the podcast). We need to become aware of how certain design choices impact the effectiveness of our visual communication.
Elijah believes that this is dangerous: it focuses on “maximizing numeric precision” and, worst of all, it “becomes inoculation almost against any kind of challenging or more advanced chart types”. Also, “the people that hate pie charts the most are the people that have that level of data visualization literacy”. He cites Stephen Few’s books as examples of promotion of this level of literacy.
While I don’t disagree with Elijah on this, I tend to be more optimistic. I see a minimum data visualization literacy as a necessary first stage of a rite of passage where you have to get rid of a truckload of long-held misconceptions, fight peer pressure and, at the same time, try to keep internal and external clients.
I wrote this in my book:
However, I remain concerned that this [3D effects and pie charts] may be the only part of the message received outside of the data visualization community. I’m also concerned that, stripped of these two mainstays of antiquated visual representations, people now feel lost between a world that is no longer theirs and another that they’re just beginning to explore.
Horror vacui and all, maybe people need to replace those misconceptions with a fresh set of dogmas. Temporarily, I hope.
… and beyond
Elijah is right when he says that there isn’t much structured training beyond the minimum data visualization literacy.
Things are a bit specific in the Excel world, where run-of-the-mill charts are the norm and where most people don’t change the defaults. We need good defaults and templates and a fast way of designing charts that adhere to the organization’s style guide (like this example). But we also need to understand the limited usefulness of these prepackaged messages, and tailor them to meet specific needs (or simply create something new from scratch).
Any training needs to take into account production processes and the expected skill set: many graphic designers need to improve their numeric literacy, while Excel users need to understand that design is not about making pretty things. Here are a few ideas (again, from my book) for this training, tailored for Excel users:
- Improve workbook structure and tidy your data;
- Better understanding of how statistics and visualization can work together in a symbiotic mutualistic relationship;
- There is no spoon. There are no chart types, only data points and visual objects and variables to help make sense of their relationships (network and geographic data have a different nature);
- Narratives, stories, landscapes: call them whatever you want to, but break the mental barrier around the single chart;
- Less is more, but then add, change, emphasize or move things around;
- Approach color from a functional point of view;
- Aesthetics (as a starting point) have no place in Excel data visualization.
But, is training enough?
The best way to learn a language is to move to a country where it is spoken and be forced to use it 24/7. Everything else will slow you down. Think of data visualization as a language: you must use it and be surrounded by it. You can’t assume a one-day training course will change your life forever: it can only show you a path.
Getting trapped in a no-man’s land is the likely outcome of a one-day course. As I said above, people become painfully aware of how stupid using canned pseudo-3D effects is, but that’s about it. What to do next? Having some standards and style guides can help, but I don’t see how a successful change in data visualization practices can be accomplished without guidance (call him/her consultant/mentor/tzar/manager, whatever). Data Visualization Tzar sounds great, though. I wouldn’t mind.
You know Excel users have a conflicted relationship with aesthetics, right? But one can argue that a harmonious relationship with aesthetics often hides a conflicted relationship with the data, and failures in the most basic numeracy and visual communication. On a level playing field, “not really data visualization” can’t be applied to Excel datavis alone:
Yet, as I argued earlier already, I don’t think we gain much from overemphasizing the (supposedly) fundamental differences between “serious/functional” and “aesthetic/entertaining” data visualizations, or, conversely, diminishing Excel dataviz work as “not really data visualization”.
What matters here is that aesthetics is a fault line in data visualization. Since a beautiful object is also seen as more functional (I dare to say that the other way around also works) there is a lot of overlap between form and function, and your visualizations will always have both. If you redraw David McCandless’s Colors in Cultures you don’t remove aesthetics, you just re-frame it.
The problem with Excel is that emphasizing aesthetics results in bad makeup. That’s why each object in an Excel chart must have a functional justification (why it is there, and how much of it is needed) before you add anything else that can be seen as mainly cosmetic. Combined with a clear editorial dimension, and freed from “chart types”, you can end up with a pretty and interesting chart. Conversely, if you don’t have the right design skills and you think you can compensate by using canned 3D effects and bright colors, you are doomed. Just like the graphic designer’s grand idea that fails to encode the data.
Excel data visualization community? In your dreams!
When I’m in a good mood, I see Excel users as hobbits:
“Excel users have been living and making charts in the four Farthings of the Industry for many dozen years. Quite content to ignore and be ignored by the world of the Big data visualization Folk — the field being, after all, full of strange creatures and tools beyond count. Excel users must seem of little importance, being neither renowned as great visualization experts, nor counted among the very wise.”
I wish this could be remotely true but, more often than not, the Excel data visualization ecosystem feels more like a tips & tricks gutter than a Shire. Let Ann tell you about the size of that community:
Hmm. The Las Vegas Convention Center is probably out of the question for our next meeting, at least in the near future.
Actually, it is huge. Potentially. There are hundreds of millions of Excel users, and probably many of them make charts on a daily basis. Even if only a fraction go beyond defaults, it’s still a huge number. More than enough to help me retire early, if I could convince them to buy my book (hint hint).
This final section is a bit more personal. Like many, I’m blocked by Edward Tufte on Twitter. Being blocked by Tufte and writing a book labeled as “lite” by Stephen Few is quite an achievement! Now, I couldn’t care less about being blocked by Tufte, but Few’s label saddens me, and I don’t really know what he means. Since everyone tells me he is such a nice guy in person, I’m still hoping to discuss it over a glass of wine.
This brings us to how successful you can be at selling your ideas. It’s hard for an introvert to disagree with Elijah:
You can’t say the only people who can succeed in data visualization are those that are individually, socially, power… you know, good at that kind of things. We want people who can succeed in data visualization who aren’t good at social cues […], and right now it very much seems like it is. In fact, I would say almost all of the successful people I see in data visualization have that, and I don’t think that that should be a requirement.
I don’t think this covers the whole story, though. Excel is a liability in many cases, and don’t get me started on the added impact of living in a small, non-English-speaking country like Portugal, where I live. It doesn’t have to be that small and irrelevant: Alan M. MacEachren, author of How Maps Work, said that data visualization would be very different today if we didn’t have to wait until 1983 for an English translation of Bertin’s Sémiologie Graphique (published in 1967 in French).
I cherish this idea that a familiar tool like Excel could be used to learn data visualization and improve data literacy for everyone. Most people just need that gentle push to begin their journey.