Thoughts on higher-level visualization libraries & the visualization industry
This weekend, I attended some of vizfest (d3.unconf). As at most conferences, there were times I had opinions but didn’t feel extroverted enough to raise my hand. One of those was the discussion of higher-level visualization libraries. Treat this post as a hastily written braindump of my thoughts, factoring in that:
- I don’t actually create many visualizations, being mostly a ‘meta’ practitioner in the fields I work in. Heck, when I worked in maps, I rarely made maps.
- I already have a stated opinion on the issue of How React and d3 Fit Together: that opinion is, essentially, that they just work as long as you let only one of them touch the DOM. My favorite d3+React framework is no framework. Reference point 1: I’m primarily an engineer, not a visualizationeer.
- I’m pretty comfortable with d3, and I have a stated opinion on the value of saving typing: usually software that focuses on code succinctness is missing the point.
The observations about higher-level d3 visualization libraries are, well, that there are a lot of them. Many of them have moderate internet fame, but in comparison to d3, “the library that conquered an industry,” none of them has broken through to become a de facto standard. And few have become successful open source projects with escape velocity: most are maintained by their creators.
But, to get to the point: “higher level d3 libraries” is not a well-defined-enough category to yield productive conversation. Specifically, I see three types of libraries:
- Libraries that encode best practices. These come from newspapers or startups or people who strongly believe that their choice of scales, fonts, presets, and other details is right. And, often, they are right: there are good designers in the biz, and it feels nice to riff off of their tasteful font choices.
- Libraries that help you avoid d3. There are many ways to phrase that but: libraries that attempt to abstract away the required d3 aha moments, like data joins. These libraries are often pitched with the idea that visualization creators should think about visualizing data, not some peculiarity about the tools involved in visualizing data.
- Libraries that introduce higher-level concepts to visualization. Fewer of these are in the wild, but Semiotic is certainly one of them: Frames are its innovation, and they’re of the type that “charts follow the form and meaning of data”, and thus Semiotic introduces a flexible typology of charts, though still a typology.
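For anyone who hasn’t hit the data-join aha moment mentioned in the list above, here is a minimal sketch of the idea in plain TypeScript. It is not d3’s implementation or its API: `Join`, `joinByKey`, and the sample bars are hypothetical names made up for illustration, and the sketch assumes keys are unique. The point is only that a join sorts things into three buckets: data that needs a new element (enter), elements that still have data (update), and elements whose data is gone (exit).

```typescript
// Illustrative only: the enter/update/exit idea behind a data join.
// NOT d3's implementation; `Join` and `joinByKey` are hypothetical names.

interface Join<E, D> {
  enter: D[]; // data with no existing element yet
  update: Array<{ element: E; datum: D }>; // elements that still have data
  exit: E[]; // elements whose data has disappeared
}

function joinByKey<E, D>(
  elements: E[],
  data: D[],
  elementKey: (e: E) => string,
  datumKey: (d: D) => string
): Join<E, D> {
  // Index the existing elements by key (assumes keys are unique).
  const byKey = new Map<string, E>();
  for (const e of elements) byKey.set(elementKey(e), e);

  const enter: D[] = [];
  const update: Array<{ element: E; datum: D }> = [];
  const matched = new Set<string>();

  for (const d of data) {
    const key = datumKey(d);
    if (byKey.has(key)) {
      // An element already exists for this datum: it gets updated.
      update.push({ element: byKey.get(key) as E, datum: d });
      matched.add(key);
    } else {
      // No element yet: this datum "enters".
      enter.push(d);
    }
  }

  // Elements that matched no datum "exit".
  const exit = elements.filter((e) => !matched.has(elementKey(e)));
  return { enter, update, exit };
}

// Usage: two bars are on screen, then new data arrives.
interface Bar { id: string; height: number }
const onScreen: Bar[] = [
  { id: "a", height: 10 },
  { id: "b", height: 20 },
];
const incoming = [
  { id: "b", value: 25 },
  { id: "c", value: 5 },
];
const result = joinByKey(onScreen, incoming, (e) => e.id, (d) => d.id);
// enter holds "c", update holds "b", exit holds "a"
```

In d3 itself the same three buckets show up as selections that you append to, modify, and remove from. Libraries in the second category above are, roughly, the ones that try to make this bookkeeping invisible.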
These intents are really different: there are some libraries that cover two or three of them, but many libraries that are mostly just one or two. And I think that each provides you with a flippant-but-nonetheless-pretty-compelling explanation for why they didn’t catch on:
- If you write a library to encode your best practice, you’re baking your opinion into it. You might have a really great opinion: you might be Tufte, and people around the world might agree that your fonts and shade of yellowish are truly the best fonts and the best shade. But nonetheless, people who use your opinions to bootstrap their own will end up with their own strong opinions that conflict with yours, and they’ll decamp to something that lets them express their opinions and tastes precisely — like a lower-level library.
- The final category is arguably the most interesting: as Elijah said in person, the point of Semiotic is that the ideas are useful, even if the implementation doesn’t catch on. That’s a tricky thing to pull off right now, in part because Semiotic is halfway between d3 and React: likely many users will use it because they’re using React and want a ‘best practice’ for visualizations, and others will assume it’s another d3 wrapper. One of the things I’ve learned over time is that actual capital-I Inventions are rare in the tech industry, and because inventions are invariably paired with implementations, they often get confused with those implementations. As Bret Victor said in a talk I can’t dig up, “I feel like I’m inventing the process of cooking, but all people are interested in is the omelette I made most recently.”
Some of the existential panic around whether Data Visualization people are abandoning the industry was related to these issues: that while we want data visualization to be about aesthetic, perception, and representation choices, we spend most of our implementation & discussion time doing & talking about data cleaning, bug-fixing, and so on. If the art metaphor can still be employed, everyone’s thinking about the right paintbrushes and the right way to mix paint and not about composition. Or something like that.
Which, absolutely, makes sense. And Elijah notes:
“We lack clear success stories for using complex data visualization in an industry setting.”
Though the conference included people from plenty of prominent tech companies, I don’t encounter their charts when I waste hours watching streaming entertainment on websites or dollars riding Rand taxis. If anything, the point of the transition from the old world of R & Python to the new one of d3 & React is that charting technology is now written in a way that lets it be beautiful enough for the front page of a newspaper and functional enough to be a core feature of a product.
Which makes you think: if most of these jobs are inward-facing, is the beautiful, interactive visualization that these experts are producing causing change? Is it fulfilling the purpose? I assume, of course, yes, and that bosses eventually come around to being data-driven and so on.
But, critically, visualization in many organizations is an operational concern — visualization means dashboards for bosses, salespeople, and engineers. Visualization isn’t part of the product. So, given startupland’s unfair-but-nearly-universal value judgment of engineering above operations, visualization engineers who want to “level up” understandably want to ship something to customers: so they switch ladders.
If visualization is primarily an inward-facing communications tool, would it make sense to use less expressive, less powerful means to create it, to match the often-lower expectations of internal communications? ggplot2 and friends are still available, and still make nice, non-interactive charts in a fraction of the code that’d be required on the web platform.
Fin. Take these thoughts as a jumping-off point, or a way to reframe the conversation to get more at the core of where visualization is going. Yell at me on Twitter if I’m wrong.