The Trouble with D3
Recently there were a couple of threads on Twitter discussing the difficulties associated with learning d3.js. I’ve also seen this come up in many similar conversations at meetups, conferences and workshops, and on mailing lists and Slack. While I agree that many of the difficulties are real, the threads highlight a common misconception that needs to be cleared up if we want to help people get into data visualization.
The misconception at the heart of these threads is that d3 and data visualization are the same thing: that to do data visualization one must learn and use all of d3, and to use d3 one must learn all of data visualization. Instead, I like to emphasize that d3 is a toolkit for communicating complex data-driven concepts on the web.
What I want to get across here is a more holistic view of d3’s role in web-based data visualization. Let’s use a metaphor inspired by Miles McCrocklin, where data visualization is likened to building furniture. All kinds of people might get into building furniture, for all kinds of reasons, especially when they see the beautiful things other people are making.
People see the impressive output and naturally want to be able to make it themselves. They ask how it is done and often hear “it was made with d3.” This is the start of the problem, because when someone hears that it was made with d3, they think “oh, I should go learn d3.” They go over to the documentation and find a long list of unfamiliar tools.
Many of these tools seem baffling. They require knowledge about woodworking and processes we’ve never thought about before, or never even knew we might need to think about. We feel overwhelmed and discouraged; the path to something that seemed within reach now looks long and treacherous.
This is where I believe we can change things for the better. Rather than changing the toolset, we can guide people along paths better suited to their goals. Let’s examine a few common situations where people find themselves wanting to do interactive data visualization and how we might plot a better course for each.
The designer
Our designer is already comfortable communicating ideas visually. They know how to break down complex problems and map them to relatable concepts, and they have a suite of tools that enhance their ability to express what’s in their mind. They often are not very familiar with programming, though they may have some experience with basic HTML and CSS for putting together static web pages. They’ve seen what people can make with d3 and are driven to be able to do the same. When they try to understand what looks like a very small amount of code in a bl.ock, they get very confused.
What part of this is JavaScript? What part is specific to d3? What is an asynchronous request? What is this DOM I keep hearing about?
For these folks, d3 offers great power and flexibility, but first they must learn some foundational technical skills to operate in this environment. I often recommend Scott Murray’s excellent d3 tutorial (and book) which covers basic HTML, CSS and JavaScript concepts. I also recommend experimenting with exporting SVG from design tools like Illustrator and Sketch and imbuing them with interactivity and data magic in the browser.
When starting out, I often encourage designers not to focus on the enter/update/exit pattern, reusability or performance concerns. It’s much more helpful to focus on getting the desired output. Once you have something close, there are lots of friendly folks who can help you make it more performant or robust.
The analyst
Our analyst is already comfortable working with data, writing queries and calling powerful functions with complex APIs. They have a workflow in a powerful environment like RStudio or Jupyter Notebooks. Most likely they come to d3 because they want to publish their analysis in some way. While the analyst is typically more comfortable programming than the designer, they are likely not familiar with the idiosyncrasies of programming in a web browser environment.
What is the difference between SVG and Canvas? What is the JavaScript equivalent to Pandas/Tidy? Why can’t I draw a line chart with an SVG line? What is this “d” attribute on a path?
For these folks I also recommend a primer in web development to familiarize themselves with concepts like the DOM. Again, my favorite starting point is Scott Murray’s d3 tutorial (and book). I would also recommend a crash course in JavaScript and JSON, followed by exporting data from their normal environment as JSON so it can be visualized in the browser.
When starting out, I often encourage analysts to ignore a lot of d3’s utility functions, as they are probably more familiar with the powerful functions in their own environments. Instead, I think it’s best to focus on exporting the data into an easy-to-consume JSON or CSV format that matches existing examples.
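To make the analyst’s question about the “d” attribute a little more concrete, here is a minimal sketch in plain JavaScript. An SVG line element only joins two points, so a line chart is usually drawn as one path whose “d” attribute lists every point. The record fields (month, value), the 200 by 100 pixel drawing area and the helper name toPathData are all assumptions made up for illustration; none of them come from d3.

```javascript
// Hypothetical records, as they might arrive from JSON.parse on an exported file.
const records = JSON.parse(
  '[{"month": 0, "value": 10}, {"month": 1, "value": 30}, {"month": 2, "value": 20}]'
);

// Build the "d" attribute for an SVG <path>: "M" moves the pen to the first
// point, and each "L" draws a straight segment to the next point.
function toPathData(points) {
  return points
    .map(([x, y], i) => `${i === 0 ? "M" : "L"}${x},${y}`)
    .join(" ");
}

// Map data values to pixel coordinates by hand (assumed 200x100 drawing area).
// SVG's y axis grows downward, so larger values need smaller y coordinates.
const width = 200;
const height = 100;
const maxMonth = 2;  // assumed known ahead of time for this sketch
const maxValue = 40; // assumed upper bound of the value axis
const points = records.map((r) => [
  (r.month / maxMonth) * width,
  height - (r.value / maxValue) * height,
]);

const d = toPathData(points);
console.log(`<path d="${d}" fill="none" stroke="black" />`);
```

In practice the line generator in d3-shape builds this kind of string for you, but writing it once by hand can take some of the mystery out of what ends up in the DOM.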
The software engineer
Our software engineer is an interesting case, because although they have a lot of the foundational skills and knowledge around web development, some of d3’s tools require a foreign way of thinking. In our metaphor, the engineer doesn’t just care about making furniture; they are working on the entire building. There are frameworks and infrastructure that the furniture has to fit inside.
What is this enter/update/exit business? Why are you messing with my DOM? Transitions… How do I unit test those?
Many developers will already be intimately familiar with the DOM and JavaScript, so my advice is to try to ignore the parts of d3 that focus on the DOM. Instead, become familiar with some key utilities for data visualization, like d3-scale. D3 is broken up into many smaller modules, so it’s pretty easy to cherry-pick the functionality you want to use.
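As a rough illustration of what a scale does, here is a hand-rolled stand-in written in plain JavaScript. It is not d3’s implementation, and the name linearScale is made up for this sketch. The scaleLinear function in d3-scale covers the same domain-to-range mapping and adds conveniences such as ticks, inversion and clamping.

```javascript
// A hand-rolled linear scale (illustrative only, not d3's code):
// maps a value in the domain [d0, d1] to a value in the range [r0, r1].
function linearScale([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Map data values from 0..100 onto a 500-pixel-wide axis.
const x = linearScale([0, 100], [0, 500]);

console.log(x(0), x(50), x(100));
```

Because a scale is just a function from data values to visual values, it has no opinion about the DOM and can be used inside whatever rendering framework the rest of the application already relies on.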
I also emphasize separating the layout of data from the visualization. Using a module like d3-hierarchy, you can generate a data structure with d3 and then render it into the DOM using your framework of choice.
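The split between layout and rendering might look something like the sketch below. To stay self-contained it uses a tiny hand-written indented-tree layout rather than d3-hierarchy itself, and the names layoutTree and renderSvgText, the sample data, and the row and indent sizes are all hypothetical. The point is only that the layout step returns plain positioned objects, and a separate step turns them into markup.

```javascript
// Hypothetical nested data, e.g. a small folder tree.
const tree = {
  name: "root",
  children: [
    { name: "a", children: [{ name: "a1" }, { name: "a2" }] },
    { name: "b" },
  ],
};

// Layout step: walk the tree depth-first and return plain objects with
// coordinates. No DOM access happens here, so it is easy to unit test.
function layoutTree(root, rowHeight = 20, indent = 15) {
  const nodes = [];
  function visit(node, depth) {
    nodes.push({ name: node.name, x: depth * indent, y: nodes.length * rowHeight });
    (node.children || []).forEach((child) => visit(child, depth + 1));
  }
  visit(root, 0);
  return nodes;
}

// Render step: turn the positioned nodes into markup. A framework component
// would normally do this part; a string template stands in for it here.
function renderSvgText(nodes) {
  return nodes
    .map((n) => `<text x="${n.x}" y="${n.y}">${n.name}</text>`)
    .join("\n");
}

const positioned = layoutTree(tree);
console.log(renderSvgText(positioned));
```

With d3-hierarchy, the hierarchy function together with one of its layouts would play the role of layoutTree, and the render step could stay exactly where the rest of the application’s rendering lives.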
Silver bullets
These situations are loose archetypes. Many people will fall somewhere between them, and that’s perfectly fine too. The idea is to separate out the goals and constraints so that we can better guide the diverse folks entering our community.
I personally think of web standards as the lowest common denominator for global communication. The graphics APIs are not ideal, but if you want to instantly distribute your data-driven experience to billions of people, I think it’s reasonable to pay the price of a relatively steep learning curve. The underlying concepts of 2D graphics, visual design, user experience design, information architecture and programming all transfer directly to many other endeavors besides data visualization.
But sometimes a chair is just a place to sit. We don’t have the time or money to care that much, and IKEA will do just fine! In those cases there are plenty of charting libraries that only need a little bit of configuration to get going.
In his recent article, Elijah Meeks has made a great map of the d3 API that breaks down the toolbox into useful categories.
I’ve also attempted to map out the d3 learning landscape in my article The Hitchhiker’s Guide to D3, which gives some links and starting points for what I believe are some of the more essential concepts.
A while back I interviewed a handful of data visualizers who learned d3 in the process of expressing themselves and the datasets they cared about. The common theme was that they had started with goals. They learned what they needed from d3 along the way to achieving those goals.
So grab a map and plot your own course through the vast world of data visualization. You can find some trails others have blazed with Blockbuilder search, try out JavaScript’s very own notebook environment, Observable, and join over 3,000 like-minded chair makers, I mean data visualizers, on the d3 Slack channel.
Good luck! I look forward to seeing your visualizations!
I’d like to thank Erik Hazzard, Kerry Rodden, Zan Armstrong, Yannick Assogba, Adam Pearce and Nadieh Bremer for their feedback on this article.