I was recently asked on Twitter a question that I’ve been asked in one form or another several times since I became a Senior Data Visualization Engineer at Netflix:
In all my time writing about the different kinds of data visualization people and how data visualization is stuck in a conservative rut or might be choking itself professionally, I don’t think I’ve ever directly engaged with what, exactly, it is that I do.
I’m always a little on edge when I get asked this question because normally a question like this comes from one of two sources: people looking to transition into the profession or people who work alongside the profession who want to better understand it. But in the case of data visualization, this kind of question also comes from people in the field who don’t believe that there is such a profession. So, I have to figure out if the person is asking me “How do I become a successful X” or “How do I work better with X/Do I need to hire an X for my team” or “Why do you insist that a skill is a profession? There are no Senior Microscope Engineers, you’re just a UI developer, you can’t really expect me to believe you spend 100% of your time making pie charts, you nasty little fraud.”
So for clarity’s sake, I’ll answer all three of those questions.
Data Visualization Engineering Skills
Data visualization is not purely technical; it has a strong design element and requires a deep understanding of the theoretical factors involved in communicating information visually. But all the theoretical and design knowledge in the world isn’t going to help you if you’ve artificially constrained the possibility space because you don’t know what’s technically possible with the tools you use. So while it’s not entirely technical, it is strongly technical, and there’s no getting around it.
That said, the specific skills necessary to engage in the day-to-day tasks of a data visualization engineer depend on the technology in use where they work. If the go-to method for presenting data visualization is notebooks or BI tools, then they need to be an expert not only on their use but also on how to push and modify them beyond their traditional boundaries. For me and for many data visualization engineers, it means you have to have a solid understanding of UI development so that you can create not only custom data visualization elements but also the application and data services necessary for those elements to respond to the spectrum of activity that your stakeholders demand.
Regardless of the technology, successful data visualization engineers have to understand principles of design, both graphic design and, more generally, user-centered design. Analytical applications need to do more than technically fulfill some specifications; they need to actually enable readers to find the insights they expect to find with the context they need. From a skills perspective you need to develop in three distinct areas:
- Technically, you need to rigorously hack at data visualization methods and really be able to reproduce any of the charting methods you see. This isn’t because you get merit badges for each chart type but rather because data visualization is fundamentally combinatorial and the most effective analytical applications are not a single chart but a combination of several forms of information visualization married together. You can’t make that or even plan for that if you feel like certain channels, layouts or other methods are off limits because you don’t know how to implement them.
- Theoretically, you need to be able to understand the fundamental principles of visual display of information. You need to know why certain visual structures resonate with viewers and which are the most effective ways to encode information into graphics. It’s not enough to ape the derision of data visualization experts and proclaim “pie charts are bad” or “color is hard”; you have to actually understand color and connection and size and how they deliver information.
- Practically, you need to think of yourself as a designer first. The real challenge with data visualization is finding out what your readers want, which only happens if you can distill the problem as expressed by their feature requests, the artifacts they’ve already produced and the actual structure of the data. This touches on interaction design, information design and graphic design. That’s a lot of design.
Roughly speaking, I passed through those different periods in that order but I don’t think that particular path is the right one to be an effective data visualization engineer. It’s more important that you recognize which mode you are in, and whether you feel you are as advanced in each area as you are in your strongest.
Do I Need a Data Visualization Engineer?
Specialized positions that focus on data visualization are just like specialized positions that focus on design or data engineering. In small companies or in otherwise constrained environments where the only person who needs to understand a data visualization product is the person who made it or their team, a specialist might be superfluous. The value of a specialized position comes in three primary situations:
- Large-scale applications built for multiple stakeholders across an organization who do not all share the same context or level of domain expertise. In these cases, the data visualization specialist can, by interviewing stakeholders and reviewing existing data visualization products, synthesize an application that is not only as effective as the existing tools for stakeholders but is also more broadly accessible.
- Views into complex data-driven aspects of a business, especially those being supported or supplanted by machine learning, where decision makers need to trust the algorithms that are replacing their intuition. Data visualization of the performance of algorithms for the purpose of identifying anomalies and generating trust is going to be the major growth area in data visualization in the coming years.
- Building prototypes. Along with large-scale applications, there are experimental approaches that can be better served with novel data visualization, or bespoke products meant for very specific audiences where the form can be more exotic than a traditional dashboard. Success in those cases depends on someone who has not only the talent to build complex data visualization but also an understanding of which rules can be bent or broken in very specific use cases.
If you want to be successful in data visualization in industry you need to understand the job. While the role varies, it mostly involves developing views of data for stakeholders, who typically are executives, product managers and data scientists. It differs from an analyst role in that the focus is not on a question but rather on an audience that typically needs something more than a single report and who expects views into the data that generate more than just the expected insights. This means fulfilling whatever basic feature requests the stakeholders have but also bringing with those features novel views into the data that leverage a familiarity with the capabilities of advanced data visualization. All that means fostering ambitious and innovative views that stakeholders may initially be uncomfortable with.
Those stakeholders will present their feature requests in different ways depending on their own practice.
- Product Managers will typically talk about the data itself, and often fall back into the refrain “just show me the data”. They ask for download buttons so they can get a CSV or the queries that drive the view, and while ad hoc analysis will always occur, it’s these requests that are the most fertile for data visualization, because when you dive into them you find out that they perform ad hoc requests to reveal patterns that are not so easy to see with off-the-shelf tools or traditional data visualization methods. Difference charts, connected scatterplots and boxplot series have all come out of requests to “show me the data” that, when translated, really meant they wanted to see some higher order structure in what they thought could only be presented as a time series or a bar chart.
- Data Scientists might come to you with the same questions but often come with feature requests that look like notebooks. Using ggplot2 or the equivalent, they have a preferred data visualization method that they were able to develop in an analysis of a fixed dataset that now they want to see integrated into a more broadly accessible dashboard or other internal application. This requires that you understand how to recreate what can be significantly complicated charts and, even more challenging, how to make those charts interactive and dynamic.
- By executives I don’t necessarily mean the C-level but any decision makers who look to data visualization for context and high-level insights. This could be the stereotypical “busy executive” who shows up often in data visualization manuals, but it could also be a presentation audience that doesn’t have the same depth of knowledge about the source material. In these cases, a data visualization engineer needs the skills and knowledge to help facilitate communication of insights and points of interest using color, annotations and other techniques drawing on visual cognition.
But Is It a Profession?
When I wrote “If Data Visualization is So Hot, Why Are People Leaving?” I received, as I’d hoped, some pushback about whether or not there even was such a thing as a “data visualization engineer”. Among those pushing back was Stephen Few, a respected author of multiple data visualization manuals. That pushback was no surprise; rather, it echoes what I’ve heard many times within the data visualization community. Many people who you would think of as colocated with me in some tidy Venn diagram instead bristle at the idea of data visualization being a true profession, and prefer to label themselves something more generic, like a UI or Full Stack Developer, or even go so far as to orient their career in such a way as to deemphasize their earlier focus on data visualization in favor of a focus on data engineering or data science.
When I first started writing this piece, I planned to directly engage with these arguments, but now, having dealt with the technical and organizational aspects of what it takes to be a data visualization engineer, I feel like I don’t have much else to add. Only that: data visualization is a broad and challenging field and the communication of information using graphics relies enough on a specific set of theory and practice that it justifies a specialist as much as any of the other data-oriented careers.
As an aside: I like Stephen Few, and more importantly I think he’s good for the field. I met him once and he was thoughtful and engaged even though I talked about network data visualization, which would excuse any ill behavior from any person. I don’t think he’s 100% correct about data visualization, but I do respect his willingness to actively and publicly engage on the subject, and respond critically when he thinks someone isn’t being rigorous or professional. It’s a shame, in fact, that there are so few people willing to engage publicly in a critical manner and that when people do they are chastised as impolite. If Stephen hadn’t responded publicly to my earlier comment, I’d be forced to refer obliquely to other data visualization theorists who have said the same thing but only in private communication, which not only makes for weaker writing, it makes for a weaker profession.