2019 Annual Data Visualization Survey Results
In 2017 and 2018 I ran an exhaustive survey of the data visualization community to try to better understand the trends and issues that were most important to our community. With the founding of the Data Visualization Society, I thought it made sense for that survey to pass over to the DVS so that it could be better designed and its responses better handled by a real organization, rather than a single individual.
The 2019 edition of the survey ran from May to June and received over 1350 responses. I’m happy to announce those responses have been cleaned up and are publicly available here. You can find the 2017 & 2018 data here.
The results continue a theme from last year’s survey: at least among respondents, we are seeing a real youth movement in the profession. Half of the responses indicated 3 or fewer years of experience. The gender balance still skews dramatically toward men, who outnumber women practitioners 2 to 1.
There’s an interesting detail in the questions that posed a few standard phrases about data visualization and asked how strongly the respondent agreed or disagreed with them. One question asked whether respondents were on the lookout for new tools. The answer was what we might expect: significant agreement that new tools and techniques are important to success.
But if we examine the results of a related question, we see that respondents don’t think the tools themselves are keeping them from being successful. Rather, more respondents thought it was their skills, not their tools, that held them back.
Annual Survey Visualization Challenge
This overview will soon be joined by more robust and interesting deep dives into the data. That’s because the Data Visualization Society is running a competition, with celebrity judges and cash prizes, for the best data visualization made with the survey results. I’m sure the results of that challenge will provide great insights for our community using this rich dataset. You can see the full description of the challenge here.
I didn’t want to dig too deeply into the data beyond what was necessary to clean it and provide the simple overview above, but I noticed a theme in the responses that was summarized by another of these questions.
If we look closely, it goes beyond this question and shows up throughout the free-text responses. For instance, what are our biggest frustrations?
In 2018, the theme among responses was most decidedly that the data visualization community felt like it needed to improve its design skills. It was clear, regardless of whether you considered yourself a scientist, engineer or designer, that respondents wanted to invest more in learning design than new tools or techniques. I wrote about this in the context not only of the overall community but also with a specific focus on responses from R users.
This year, time is a clear theme in the data. Throughout the responses, whether asked specifically about how much time they have to make data visualization or just generically what they needed to make better data visualization, the community responded with “time”.
So… time to do what?
If we generate a word tree from the frustration responses, we can see that they fall into three major categories:
- Time to learn more about data visualization
- Time to create data visualization
- Time to spend using exploratory data visualization
The image above shows only a few of the sentences; the full (much messier) word trees are at the end of this piece. These trees aren’t based on all of the responses to that question, just the responses that use the word “time”.
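For readers who want to try something similar with the public data, here is a minimal Python sketch of the idea: keep only the responses that mention a root word, then count the words that follow it, branch by branch. This is not the code used to make the trees in this piece. The function names are hypothetical, and the sample sentences are made up for illustration and are not actual survey responses.

```python
import re

def tokenize(text):
    """Lowercase a response and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def new_node():
    """A tree node: how often this branch was seen, plus its child words."""
    return {"count": 0, "children": {}}

def build_word_tree(responses, root="time", depth=3):
    """Count the word sequences (up to `depth` words) that follow `root`.

    Responses that never use `root` contribute nothing to the tree.
    """
    tree = new_node()
    for response in responses:
        tokens = tokenize(response)
        for i, token in enumerate(tokens):
            if token != root:
                continue
            tree["count"] += 1
            node = tree
            # Walk the words after the root, adding or extending a branch.
            for word in tokens[i + 1 : i + 1 + depth]:
                node = node["children"].setdefault(word, new_node())
                node["count"] += 1
    return tree

def render(node, label, indent=0):
    """Return the tree as indented text lines, most common branches first."""
    lines = [f"{'  ' * indent}{label} ({node['count']})"]
    ordered = sorted(node["children"].items(), key=lambda kv: -kv[1]["count"])
    for word, child in ordered:
        lines.extend(render(child, word, indent + 1))
    return lines

# Invented example sentences (assumption: NOT real survey responses).
sample_responses = [
    "Not enough time to learn new tools",
    "I need more time to create charts",
    "No time to learn design",
    "Too many meetings",
]

tree = build_word_tree(sample_responses, root="time")
print("\n".join(render(tree, "time")))
```

A nested count like this is only the data structure behind a word tree. Drawing it as the branching diagram shown above is a separate step, and real free-text responses would likely need more cleanup (typos, punctuation, phrases like “timeline”) than this simple tokenizer does.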
This aligned well with my own experience professionally and the concerns I’ve heard from fellow practitioners. Data visualization, while growing more important to organizations, has been pigeonholed into being a data access problem or a nice-to-have optimization at the end of the “real work”.
Data Visualization as Data Access
We’re awash in data and organizations are constantly investing in and generating more data every day. To act on that data, we need to see it, and to see it, we need data visualization. Seeing and understanding data as a resource, its lineage and the anomalies in its storage and generation is a critical need, but if data is seen as the only reason for data visualization, an organization will not legitimize and invest in design. Instead, they’ll spend all their time checking off lists of features and settling for tools and libraries that boast how well they plug into various data and promise to deliver results with little or no additional effort. Those tools, while useful for browsing big data stores, are not well-equipped for building powerful explanatory data visualizations.
Explanatory Data Visualization as Optimization
The other constrained view of data visualization is that it is a step to be performed after the “real work” of analyzing the data. In this case, the tools and roles that an organization optimizes for will be focused on exploratory data visualization. Explanatory data visualization, which requires time spent in design and access to tools that allow you to annotate and create rich UX, isn’t the only thing to lose out. This view also promotes a culture where “if you’re smart enough, you can read this crappy chart”.
This creates a natural tension around data visualization work: it is seen as important but not necessary, so it’s something that can be put off. As a result, tools and techniques are optimized for doing the best you can with limited time, reinforcing the secondary characterization of data visualization. That might explain why people are looking for more tools even though they feel like their tools aren’t holding them back: they’re being evaluated not on how good their data visualization is but on how fast they produce it.
We have enough tools with enough capabilities. We have enough workshops and guides and books. What we don’t have is enough time to do our jobs. That can only happen if practitioners, organizations, and leadership acknowledge the validity of investing in data visualization at the same level that they invest in the other pillars of data: data science, data engineering, and data analytics.
APPENDIX: FULL WORD TREES: