The Data Are Not Objective, and Neither Are You

Data Visualization Society members reflect on what a turbulent 2020 teaches us about working with data

Alexandra Khoo
Published in Nightingale
7 min read · Sep 14, 2020

On a dedicated channel, #dvs-topics-in-data-viz, in the Data Visualization Society Slack, our members discuss questions and issues pertinent to the field of data visualization. Discussion topics rotate every two weeks, and while subjects vary, each one challenges our members to think deeply and holistically about questions that affect the field of data visualization. At the end of each discussion, the moderator recaps some of the insights and observations in a post on Nightingale. You can find all of the other discussions here.

Even if change is the one constant, we’re not comfortable living with uncertainty. A sense of foreboding can hurt more than the pain itself. In a Harvard Business Review article on crisis management, Geeta Menon and Ellie Kyung cite research demonstrating this phenomenon, whether it’s the threat of an electric shock or the perceived loss of job security.

Perhaps that’s why there’s a tendency to treat data as a proxy for reality, even when we shouldn’t, at least not without further questions. It speaks to our human need for control. Here, Data Visualization Society (DVS) members reflect on 2020, with an eye to our relationship with data and our responsibilities as data practitioners.

“Raw Data” is an Oxymoron

The authors of the anthology “Raw Data” Is an Oxymoron remind us that data are never truly raw in the way some natural resources are:

“Some elements are privileged by inclusion, while others are denied relevance through exclusion.” — Daniel Rosenberg, Travis D. Williams

The very act of measuring and collecting data involves interpretation and value judgements. In Data Feminism, Catherine D’Ignazio and Lauren Klein challenge us to think deeper about the power dynamics behind data science. Whose interest is protected? Who gets heard? Who doesn’t?

Tableau consultant Bridget Cogley urged us to ask ourselves, what voices are being amplified in our field?

“We, as a profession, show our values in the stories we choose to tell, amplify, and not tell.” — Bridget Cogley

In this light, it's worth reflecting on the relative quiet within the DVS in response to Black Lives Matter and anti-racism protests, as compared to the massive rush to visualize COVID-19 data. A few questions come to mind: Is it a matter of data — knowing what reliable datasets are out there and where to access them? If it’s because the topics make people feel uncomfortable, why is it easier to abstain than to use dataviz to put a spotlight on injustices? Does this point more to the importance of diversifying our membership or the need for greater awareness that data-related practices like data science and visualization are a form of power?

The Double-Edged Sword of Data Visualization

There is an enormous role for data visualization practitioners to play. Nightingale Managing Editor Isaac Levy-Rubinett observed that, for better or for worse, 2020 seems to cement data visualization as uniquely suited for the social media age. In some instances, a visualization can go viral in an instant and convey at least as much information as a long-form article.

Business Intelligence (BI) consultant Luke Stonehouse and data visualization designer Soti Coker identified the Financial Times’ evolving “Coronavirus Tracked” as one of the best visualizations of the year so far. Nightingale Editor-in-Chief Jason Forrest’s vote went to the ubiquitous “flatten the curve” chart. What these visualizations have in common is their ability to introduce important and complex concepts clearly, in a compelling and accessible way.

Financial Times’ Coronavirus Tracked: Line chart showing the seven-day rolling average of new deaths from Covid-19
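
For readers unfamiliar with the term in the caption above, a seven-day rolling average replaces each day's count with the mean of that day and the six days before it, which smooths out weekday reporting quirks. The sketch below is only an illustration: the function name `rolling_mean`, the trailing (rather than centered) window, and the daily counts are all assumptions made for this example, not the Financial Times' actual method or data.

```python
from collections import deque


def rolling_mean(values, window=7):
    """Trailing rolling mean; returns None for days without a full window."""
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            # The oldest value is about to be evicted, so drop it from the sum.
            total -= buf[0]
        buf.append(v)
        total += v
        out.append(total / window if len(buf) == window else None)
    return out


# Hypothetical daily counts, invented purely for illustration.
daily_counts = [10, 12, 9, 30, 11, 13, 8, 12, 14, 10]
smoothed = rolling_mean(daily_counts)
print(smoothed)
```

Note that the first six days have no smoothed value. Whether to leave them blank or use a shorter window is itself a design choice worth stating on the chart.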

However, data visualization can be a double-edged sword. With bad charts, dataviz can become a catalyst for misinformation and lies. Soti observed that it was as if the creators of such charts knew that the public’s data literacy was low and chose to take advantage of it.

Not all bad dataviz is intentional, of course, but that doesn’t mean the harm will be any less. In the case of visualizing COVID-19 data, DVS Operations Director Amanda Makulec pointed out that there are real-life implications to inadvertently creating a misleading chart for public release. This is in addition to the risk of getting caught in the polarization tangle, at least in the United States, where even the act of wearing a mask can be a charged political statement.

Data Literacy Isn’t a Silver Bullet

While data literacy is important, it’s an uphill task to get people to dig into the nuances of data interpretation. Some don’t have the skills or training to do so or feel intimidated by data; others simply don’t care.

Independent data visualization designer Jane Zhang shared her own struggle in tackling the issue of data literacy with family and friends. In her experience, those who didn’t care about literacy continually misunderstood the data or exaggerated what they saw reported in the news. It was not easy to bring up the need for careful evaluation in casual conversation.

Ultimately, data literacy is only one piece of the puzzle. We also have to confront the responsibilities we have as people who work with data. On the brighter side, the Data Visualization Society is more than a water cooler where dataviz enthusiasts geek out. It is a great place for setting the tone on visualizing data responsibly and for showing new practitioners why we as a profession need to be more mindful of the second-order effects of our dataviz creations.

“There have always been pockets of people doing viz for a long time, but the DVS really brings it all together globally, as a great community that isn’t tied to one tool or technology base.” — DVS Member, Nicole Edmonds

What Can You Do?

  • Be mindful of what you’re choosing to viz. Consider how you can lend your dataviz skills to explore and expose critical, but understated, issues like mental wellness or civic demonstrations (check out Mass Mobilization Project data for instance). For those who still want to viz about high-risk topics like the COVID-19 pandemic and share it openly, Amanda Makulec’s tip is to use datasets from countries with robust testing, those that have effectively “flattened the curve,” and where there are strong national information systems to capture timely, complete, and accurate data. Collaboration with subject matter experts (or at least investing time to learn from them) is critical for developing accurate visualizations of data about complex topics. If you’re unsure, reach out to the DVS community and ask for advice.
  • Invite others to question your data and share alternative views. Echoing the experiences of many, Data Science student Ben Xiao shared that he gains a lot when getting feedback from fresh eyes on his visualizations. Catherine D’Ignazio pushes us to take it one step further and “make dissent possible” during the design process. The idea is to include more voices and take a collaborative approach to the interpretation of the data we use. In a similar vein, Bridget Cogley suggested having more discussions on data lineage or how the data came to be so that we can better pinpoint the biases that get introduced along the way and confront them.
  • Learn more about visualizing uncertainty. Most visualization techniques were created with the assumption that the data are free from uncertainty. Yet this is rarely the case. There are times we need to make explicit the inherent uncertainty in the data so that our viz can be taken in good faith. The good news is that there is a growing number of resources to help with this challenge. For example, Nathan Yau of FlowingData has a cheat-sheet on the options available, while Dr Lace Padilla, Assistant Professor in Cognitive and Information Sciences, and her coauthors recently reviewed best practices based on how the mind processes different types of uncertainty.
  • Stress-test your viz for possible misinterpretations. Put yourself in the shoes of your audience and consider how much dataviz skill or data literacy is required to understand your work. Consider how you can make the visualization more user-friendly by adding elements like annotation and color cues. It’s even better if you can get opinions from a diverse audience. The DVS Slack has a dedicated channel, #share-critique, for receiving feedback. Use it!
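
As one small, concrete way into the tip on visualizing uncertainty above, the sketch below computes a mean together with an approximate 95% interval, which could then be drawn as an error bar or shaded band instead of a bare point. It is a minimal illustration under stated assumptions: the function name `mean_with_interval` is hypothetical, the sample values are invented, and the normal approximation (z = 1.96) is a simplification that can be poor for small or skewed samples.

```python
import math
import statistics


def mean_with_interval(sample, z=1.96):
    """Return (mean, low, high) using a normal-approximation interval."""
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    m = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return m, m - z * se, m + z * se


# Invented measurements, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mean, low, high = mean_with_interval(sample)
print(f"mean={mean:.2f}, interval=({low:.2f}, {high:.2f})")
```

This is only one of many ways to express uncertainty, and the resources mentioned in the list above cover alternatives and how readers actually perceive them.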

These steps are likely to push us out of our comfort zone. In 2020, we saw many wonderful new initiatives, like DVS Office Hour and the Find a Dataviz Buddy system, to help upskill the community. We can also expand what it means to be proficient in data visualization: recognizing how culture and its associated biases are embedded in the data, knowing when to visualize and when not to, and so on.

It won’t be easy, but cultivating such core non-technical skills is especially critical when we create visualizations outside of our domain expertise or use data on sensitive topics. These same skills will mark the professionalization and maturation of the dataviz field.

Many thanks to DVS members who contributed to the discussion that made this article possible: Amanda Makulec, Ben Xiao, Bridget Cogley, Isaac Levy-Rubinett, Jane Zhang, Jason Forrest, Luke Stonehouse, Max Graze, Nicole Edmonds, and Soti Coker.

Alexandra Khoo is a policy strategist turned cultural data scientist. As part of the tech team at Synthesis, she builds and layers bespoke datasets to give a fresh take on people’s behaviors, passions, and values. She also enjoys doodling imaginary creatures.
