For more information on the novel coronavirus and Covid-19, visit cdc.gov.

Ten Considerations Before You Create Another Chart About COVID-19

To sum it up — #vizresponsibly; which may mean not publishing your visualizations in the public domain at all

Amanda Makulec
Mar 11 · 9 min read

Teams are making ready-to-use COVID-19 datasets easily accessible for the wider data visualization and analysis community. Johns Hopkins posts frequently updated data on their github page, and Tableau has created a COVID-19 Resource Hub with the same data reshaped for use in Tableau.

These public assets are immensely helpful for public health professionals and authorities responding to the epidemic. They make data from multiple sources easy to use, which can enable quick development of visualizations of local case numbers and impact.

At the same time, the stakes are high around how we communicate about this epidemic to the wider public. Visualizations are powerful for communicating information, but can also mislead, misinform, and — in the worst cases — incite panic. We are in the middle of complete information overload, with hourly case updates and endless streams of information.

As a public health professional, might I ask:

Please consider if what you’ve created serves an actual information need in the public domain. Does it add value to the public and uncover new information?

If not, perhaps this is one viz that should be for your own use only.

If, after reading all of these caveats and warnings about the harm and panic that can be caused by misleading visualizations, you’ve decided to explore and visualize data about COVID-19, here are ten considerations for your design process.

Ten Considerations when Visualizing COVID-19 Data

1. Do more to understand the numbers than just downloading and diving right into a dataset.

The available data on COVID-19 cases isn’t a dataset to go on autopilot with and play around with different chart concepts, particularly if you plan to publish for the public.

Review resources about COVID-19 and the SARS-CoV2 (the novel coronavirus that causes the disease). Start with the CDC dedicated response page and explore more on the Johns Hopkins Coronavirus resource page.

It’s a good practice to always understand the context of the data you’re working with, but is essential when creating and sharing visualizations during an epidemic where those visualizations have the potential to incite panic just as much as they have the potential to inform.

2. Case numbers are the most readily available, thorough, routinely updated data sources, but that doesn’t make them simple to visualize.

You can find case numbers from primary sources (e.g. CDC, Ministries of Health, state departments of public health, and other agencies collecting the data) and aggregated data sets (e.g. the dataset underlying the Johns Hopkins COVID-19 operations dashboard).

Case numbers seem to lend themselves well to maps and have the advantage of being very local for answering the question, “Are there cases near where I live / where I traveled / where I’m thinking about going?” — but visualizations of these numbers can easily mislead.

Be clear on what kind of cases are represented — you can find case definitions on the WHO website. If you’re going the route of building a map, please review Kenneth Field’s detailed recommendations for mapping COVID-19 before creating your own. (See more specific notes on chart considerations in #5, below.)

Examples from Kenneth Field’s “Mapping coronavirus, responsibly.” Map at left represents a flawed example, as compared to the version at right which correctly utilizes rates as opposed to totals and “…without sensationalist colour choices.” (Source)

3. Aggregations and calculations that can be done with the case data are not necessarily what should be done with the case data.

Tableau and other tools make it easy to quickly create charts, graphs, and maps, as well as to run calculations with those numbers. It’s also common practice in data visualization to create benchmarks or comparisons between groups and countries in our work. However, when visualizing COVID-19 data these calculations need to reflect the basic principles of epidemiology.

There are nuances in the definitions of different kinds of cases (including COVID-19 definitions) which affect whether they can be aggregated or not. In public health, there are calculated metrics — such as case fatality rate — with very specific definitions that are used to understand and monitor disease spread and human impact. Just because you can perform a mathematical function on a set of health statistics doesn’t mean you should.

For example, one chart shared about COVID-19 summed the total deaths to date and divided it by the known days in the epidemic to create a special disease deaths per day aggregation. Then, that number was calculated for other major diseases for comparison. At best, this is an inaccurate comparison due to major differences in our knowledge of and resources for testing and treatment of COVID-19 compared to other diseases. At worst, it significantly understates the seriousness of COVID-19 and causes people to ignore the advice of public health professionals on social distancing and other individual actions that can slow the spread of the virus.

Finally, determining the share of the population infected or the share of infected persons who die from the disease are incredibly challenging calculations due to uncertainty in the denominator. Proceed with extreme caution when calculating any rates, and, better yet, please leave the rate calculations to the epidemiologists.

4. Be cautious when making generalized predictions or comparisons based on regionally specific data.

Many factors affect the spread and impact of the virus — such as the measures taken by a government to combat the spread and underlying population demographics.

Because of these differences, consider what is implied when making comparisons between countries with very different population sizes, political environments, and public health systems.

For example, the population of Italy skews older than that of China or the US. Because elderly populations have been identified at higher risk and are more likely to require hospital care, the percentage of cases requiring hospitalization may be higher in Italy than in countries with a younger population. (More on the ways demographics are influencing outcomes in Italy.)

5. Visualizations should inform and be honest about what isn’t represented.

There is much uncertainty in the data we have, particularly when trying to extrapolate to a general population. With an emerging disease, disaggregating and looking at cases and rates in sub-populations can help us to better understand the disease.

The number of confirmed cases is only a subset of infected persons in the population, and the number is impacted by health seeking behavior (if I’m sick, do I go to the doctor?), test kit availability (if I go to the doctor, can I get a test?), health systems factors, and other considerations.

COVID-19 is not a death sentence, and our visualizations need to reflect that. Including ‘recovered cases’ is an essential piece of context in visualizing case numbers.

Reiterating here: calculating rates — like the case fatality rate — is challenging without an accurate denominator. Leave the rate calculations to the epidemiologists.

6. Epidemiologists and public health agencies create complex models to understand how the disease may progress.

These data are likely not going to feed into a dashboard, but sometimes get cited and sourced in static charts and graphs. The benefit of using results from models from WHO, CDC, and other public health experts is that they typically go through some level of peer-review before being published.

Proceed with caution if incorporating these numbers in a visualization though: models are complex, as they try to account for the behavior of the virus, human behavior, and systems factors. As a result, models will change. If you use data from a model, document the inputs and sources thoroughly.

7. Data scientists and statisticians have also been publishing their own models and related conclusions about disease projections.

Use these with caution in framing your visualization and analysis unless they are well sourced, documented, and explained. **Preferably validated by an epidemiologist or someone else with related expertise.**

Modeling disease is complex (see #6). Rough, “back of the envelope” calculations can be more fear-inducing than helpful.

Instead, rely on well-sourced models from public health agencies and experts.

8. Make thoughtful design decisions.

Still committed to creating a visualization about COVID-19? Read existing resources on responsible visualization approaches in this context before publishing any charts or maps.

Datawrapper has an excellent set of responsible visualizations of COVID-19 with notes on the design decisions they made.

“What we considered when making these visualizations” from the awesome team at Datawrapper (Source)

You can also read this excellent thread of recommendations and critiques on visualizing COVID-19 from Evan Peck.

9. Consider the human side of what you create.

Reference terms correctly (see WHO definitions for COVID-19 cases, an explainer on R0, and the CDC Glossary as resources) and clearly define each metric for your audience somewhere in the visualization — that can be a footnote, title, subtitle, annotation, explainer text…just make sure it’s there.

Be considerate of the language you use in your visualization.

Remember that behind every data point is a person in a COVID-19 dataset. If you wouldn’t feel comfortable having someone from a high risk group read what you wrote, please revise.

10. Consider how visualizations can impact (and encourage) social responsibility as we see COVID-19 in our respective communities.

Self quarantine where appropriate. Ensure we’re not stigmatizing people who are from countries and regions that have had a lot of cases. Understand what additional steps you can take to flatten the curve and slow the spread of the virus in your community.

Esther Kim and Carl Bergstrom (Source)

And finally, consider visualizing other relevant data about impacted communities if you don’t feel you have the public health knowledge to add to the conversation around COVID-19 cases. Epidemic data isn’t a dataset to play with just to have something to show off on Twitter.


Key Resource Links

These are linked throughout this document, but called out here for ease of discovery and review:

Editor’s note: Since its publication, this article has been translated into Thai and Portuguese.

Amanda Makulec is the Senior Data Visualization Lead at Excella and holds a Masters of Public Health from the Boston University School of Public Health. She worked with data in global health programs for eight years before joining Excella, where she leads teams and develops user-centered data visualization products for federal, non-profit, and private sector clients. Amanda volunteers as the Operations Director for the Data Visualization Society and is a co-organizer for Data Visualization DC. Find her on Twitter at @abmakulec

Many thanks to Alyssa Bell for her speedy and brilliant editing for clarity.

Nightingale

The Journal of the Data Visualization Society

Amanda Makulec

Written by

Data viz designer and enthusiast for using data for social good and public health. MPH. Operations Director @datavizsociety and Data Viz Lead @excellaco.

Nightingale

The Journal of the Data Visualization Society

Amanda Makulec

Written by

Data viz designer and enthusiast for using data for social good and public health. MPH. Operations Director @datavizsociety and Data Viz Lead @excellaco.

Nightingale

The Journal of the Data Visualization Society

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store