COVID-19 Data Literacy is for everyone, comic cover image

COVID-19 Data Literacy is for Everyone

Alexandra P. Alberda
Published in Nightingale
8 min read · May 5, 2020

Our pandemic lives are deeply entwined with data visualisations. From instructional hand-washing infographics to calls to ‘flatten the curve,’ data visualisations are telling us how to live and predicting our possible futures. As the cascade of open data relating to the COVID-19 virus grows, so too do the charts and graphs claiming to decipher, decode, and translate this data for everyday understanding.

In response to this data visualisation of our everyday lives, designers and data storytellers are working hard to fight graphics that represent ‘fake news’ and educate journalists, analysts and commentators to create better data visualisations.

We created this webcomic to share some of their work and help empower audiences to better understand the COVID-19 data visualisations that now fill our everyday lives.

Caption box: This webcomic was co-curated by…
Image of a woman with black hair. Text: Anna Feigenbaum, Bournemouth University, Co-Author of The Data Storytelling Workbook
Image of a woman with long brown hair. Text: Aria Alamalhodaei, Co-Author of The Data Storytelling Workbook
Image of a woman with brown and blonde hair. Text: Alexandra P. Alberda, research illustrator.
Caption box: why data visualisation literacy matters
Knowing how to read the data presented to us can be very empowering, giving us back some sense of grounding and control.
Image of a confused person floating with data visualisations and question marks swarming around them.
Image of the previous person now standing on rocky ground, with the data visualisations now contained in thought bubbles.
When we have a better understanding of risk it can lead to more effective wellness behaviours and can help us feel empathy.
image Person A: I’M SCARED BECAUSE OF MY ASTHMA. Person B: I KNOW. BUT YOU’RE NOT 65 SO YOU SHOULD BE FINE. BE BACK IN A BIT.
same image Person A: I’M SCARED BECAUSE OF MY ASTHMA. / Person B: I KNOW. LET’S PUT TOGETHER A PLAN THAT HELPS YOU FEEL SAFE.
How we represent people, cities and countries in the media has effects in the real world, like hate crime and discrimination.
Two images. First: SOCIAL PHYSICAL DISTANCING, space given for safety. Second: STIGMATISED EMOTIONAL DISTANCING, space given for fear & violence.
Image of people now holding hands. Caption: informed emotional distancing, space given for empathy, kindness, support healing.
Caption box: Introduction to COVID-19 data communications
The coronavirus epidemic has generated a huge amount of data. This data is coming in from all over the world, in real time.
However, because this information is being collected from so many different sources, the aggregated data is complex and messy.
Image of woman looking at a single graph on her phone. Behind the phone many lines attach to different icons of data sources.
In efforts to make this data more accessible, scientists are sharing their analyses with the public.
This open data provides an incredible resource for those trying to understand things like how COVID-19 spreads and who is at risk.
image of a female doctor in protective gear working in a lab.
image of Johns Hopkins and the COVID Tracking Project logos. Text under saying: along with the CDC and others
Image of different COVID-19 news headlines. Text saying: news organisations are also trying to translate this…
text saying: complex and messy data through the use of data visualisations. image of a hand holding a phone with a news graph
caption box: but with all this data circulating…
how much do we really understand about what we are seeing?
how does the way data is presented influence our thoughts and behaviours?
what information might be missing or be misrepresented?
Data literacy advocates are asking these questions, since this data is only useful if we know how to read it.
caption box: what’s a numerator and denominator?
Many of the statistics we see to describe rates and risk factors have two main parts, a numerator and a denominator.
Bar graph of how confirmed cases of coronavirus have spread as of April 7 in ten countries. total cases is 1,359,398.
The numerator is what is being counted. Here it is the number of people with COVID-19 on April 7 on the graph: 1,359,398.
The denominator is the total population the numerator is being counted from. This is missing from the previous graph and many
Caption box: why denominators matter
In order to understand fatality rates, people need data on the number of deaths (the numerator) and the total population (the denominator).
Two bar graphs comparing Seasonal Flu fatality rates in the US in 2018–19 and COVID-19 up to Feb 11, by age category.
Same graph annotated saying: Out of all known cases of COVID-19 in people 60 plus, 6 percent died.
At first it looks like the death rate is 6%, but this figure doesn’t include all cases or the whole population being counted from.
Caption box: what’s the problem?
Not all cases of COVID-19 are being reported so we are only getting an inaccurate understanding of rates and populations.
This is particularly the case in countries where testing is primarily done in hospitals, and only with people showing severe signs.
Image of a woman with silver hair: Professor Sheila Bird from Cambridge University.
If you never actually develop symptoms but had encountered the virus, that would be an infection, but it is ‘uncountable’.
In order to make sense of statistics, we need to know from the government how many people are tested each day.
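The effect of a missing denominator can be sketched with a few lines of arithmetic. This is a minimal illustration, not a calculation from any chart in the comic: the 6 deaths per 100 known cases simply mirror the 6% annotation, and the undercount multiples are assumed values chosen for demonstration.

```python
def fatality_rate(deaths, cases):
    """Return deaths as a percentage of cases (numerator / denominator)."""
    if cases <= 0:
        raise ValueError("cases (the denominator) must be positive")
    return 100 * deaths / cases

# Hypothetical numbers for illustration only.
deaths = 6           # numerator
known_cases = 100    # denominator if only confirmed cases are counted

# Assumed ratios of true infections to confirmed cases (illustrative guesses).
for undercount in (1, 2, 5, 10):
    all_infections = known_cases * undercount
    rate = fatality_rate(deaths, all_infections)
    print(f"true infections = {undercount}x known cases -> {rate:.1f}% died")
```

The numerator never changes, yet the rate falls as the denominator grows, which is why knowing how many people were tested matters so much.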
Caption box: bring us denominators!
Image of a man with black hair: Randy Au, Data Nerd Scientist and writer for Medium’s Towards Data Science Journal.
People are interested in denominators because it matters a lot right now whether the rate implicates our future mortality.
People are getting a firsthand look at how messy the process is while trying to get an impossible ‘true number’ or measurement.
Caption box: what’s in a numerator?
People are interested in numerators or features being counted. This data on deaths is often divided by age in the news.
Bar graph of over half of deaths among over 80s. Deaths in hospitals in England, as of April 6. Age 80 plus deaths at 2,554.
Some show other variables. The more we know about numerators, the better we can understand risk, like why some data says men are more at risk.
Image of a man with brown hair: Philip Ball, Science writer.
It is best to look at these figures not as exact but as general trends and question how deaths are being recorded.
Scientists don’t know why men seem more at risk, but this could be due to other conditions, behaviour or environmental factors.
But it is important to separate data, like by sex, to help better understand causes & responses, advises World Health Org.
Image of a man with buzz cut: Andy Kirk, Data Visualisation Trainer.
Knowing pre-existing conditions is important context but it is important to publish data when you know the context it is in.
Caption box: why population size matters
We are seeing a lot of maps that show big dark bubbles or shaded areas of COVID-19 confirmed cases and deaths.
Image of a blonde woman with purple & pink highlights: Catherine D’Ignazio, MIT Professor and co-author of Data Feminism.
THAT FORMAT, KNOWN AS A CHOROPLETH MAP, MEANS HIGH-POPULATION STATES LIKE CALIFORNIA WILL APPEAR WORSE THAN SMALLER STATES.
Map of the United States colored shades of purple to show states by population density.
Same Map of the United States colored shades of purple now to show states reporting cases of COVID-19 to CDC.
These representations don’t show the healthcare infrastructures of a place, so a location with fewer cases can inaccurately seem safer.
This issue of healthcare capacity is why governments are telling people not to travel to rural country homes or camping sites.
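The population-size problem described above can be sketched by comparing raw counts with per-capita rates. The two regions and all of their numbers are hypothetical, invented purely to show how the ranking can flip once cases are divided by population.

```python
# Hypothetical regions: (name, confirmed cases, population). Not real data.
regions = [
    ("Big State", 20_000, 40_000_000),
    ("Small State", 3_000, 1_000_000),
]

def cases_per_100k(cases, population):
    """Normalise a raw case count by population size."""
    return cases / population * 100_000

# Which region looks worst depends on whether we rank by count or by rate.
by_raw_count = max(regions, key=lambda r: r[1])[0]
by_rate = max(regions, key=lambda r: cases_per_100k(r[1], r[2]))[0]

for name, cases, population in regions:
    rate = cases_per_100k(cases, population)
    print(f"{name}: {cases:,} cases, {rate:.0f} per 100,000 people")
print("Highest raw count:", by_raw_count)
print("Highest rate:", by_rate)
```

A map shaded by raw counts would darken the larger region, while a map shaded by rate would darken the smaller one. Neither view says anything about healthcare capacity, which needs its own data.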
Caption box: using data visualisation to dramatise the pandemic
It is not only the visualisation of numbers that matters, but also the ways they are presented visually.
Bar graph titled “Death rate varies by age, health and sex” breaking down each of these categories. Axis only goes to 15%.
Image of a man with silver hair: Andy Cotgreave, Technical Evangelist at Tableau and a Columnist for Infoworld.
At first we perceive this bar graph as going to the maximum, implying that 100% of 80-year-olds who get the disease will die.
But this is not a mistake. By truncating the axis to 15%, the reader can easily compare one category against another.
However, shortening the axis contributes to alarmist readings. By extending the axis to 100% the data is more clearly shown.
Same bar graph title changed to “what percent of people who contract coronavirus dies est.?” with axis now at 100%.
Another way to visualise this data might be to show survival rates. Quote attributed to Andy Cotgreave.
Previous bar graph flipped to show survival, titled What % of people survived in 44,000 cases of COVID in China, axis at 100%
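The axis choices discussed in this section can be put into numbers. The sketch below is only an illustration: the 14% death rate is a hypothetical value, not read off the chart, and `bar_fill` is a made-up helper that reports how much of the chart's height a bar takes up for a given axis maximum.

```python
def bar_fill(value, axis_max):
    """Fraction of the chart height a bar occupies on an axis from 0 to axis_max."""
    return value / axis_max

death_rate = 14.0  # hypothetical percentage, for illustration only

# Same value, two axis choices: 0-15% and 0-100%.
for axis_max in (15, 100):
    fill = bar_fill(death_rate, axis_max)
    print(f"axis to {axis_max}%: bar fills {fill:.0%} of the chart height")

# Flipping the framing from deaths to survival.
survival_rate = 100 - death_rate
print(f"survival framing: {survival_rate:.0f}% survive")
```

On the shortened axis the bar nearly touches the top of the chart, while on the full axis it covers a small portion. The data is identical in all three framings; only the impression changes.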
Another important element of data viz literacy is how different visualisations change the story and the reader’s emotional response.
Another element to analyse is color. Red is used a lot in COVID-19 data visualisations but this color has its own story.
Image of a man with light brown hair and a 5 o’clock shadow: Kenneth Field, Cartographer.
Map of China showing Coronavirus cases in each province, as of February 24, in different shades of red.
People like red maps, but is this the best color when the dataset is recording deaths in a worsening human health tragedy?
By changing the previous map to blue-green we show the same data but present it more sympathetically, explains Kenneth.
Caption box: projections are not an exact science
Many of the visualisations show future outcomes. These are based on available data, past trends and evidence-based assumptions.
Steep curved line graph showing recorded and projected COVID-related deaths in UK and US for 2020, titled simulation shock.
Because of differences in applying these sources, projections can look very different even within research teams’ models.
In addition, different statistical & machine learning techniques are used to create projections, says Angus Loten, WSJ writer.
Outbreak analytics seeks to gather all available data on an epidemic, though this can come from many different sources.
This raw data is then processed by machine learning software and used to predict new cases within a given population or other
Image of a woman with long brown hair: Zeynep Tufekci, Digital Scholar.
This is why Zeynep Tufekci says, “Don’t believe the COVID-19 models. That’s not what they are there for.”
Neil Ferguson, mathematical epidemiologist at Imperial College London, & team behind projections informing governments’ policies.
“We’re building simplified representations of reality. Models are not crystal balls…” says Neil Ferguson.
Instead of thinking about projections as crystal balls that show us our destiny we can consider them as Zeynep Tufekci says:
As a way to see our potential futures ahead of time, and how that interacts with the choices we make today.
Image of person at a crossroads: to the left, crowds under a steep line graph; to the right, social distancing & a flattened curve.
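The point that projections depend on their assumptions can be sketched with a toy calculation. This is not how the research teams mentioned above build their models; it swaps in plain exponential growth, and the starting count and doubling times are assumed values chosen only for illustration.

```python
def project_cases(current_cases, doubling_time_days, days_ahead):
    """Project cases under simple exponential growth (a deliberately crude model)."""
    return current_cases * 2 ** (days_ahead / doubling_time_days)

current_cases = 1_000  # hypothetical starting point
days_ahead = 30

# Assumed doubling times in days; each stands for a different set of choices.
for doubling_time in (3, 6, 12):
    projected = project_cases(current_cases, doubling_time, days_ahead)
    print(f"doubling every {doubling_time} days -> "
          f"about {projected:,.0f} cases in {days_ahead} days")
```

Small changes in one assumption move the result by orders of magnitude, which is one reason to read projections as a range of possible futures shaped by today's choices and not as a single forecast.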
Caption box: conclusion
People in these fields are asking governments to be open about the nature of the data they are using & provide its backstory.
Likewise, when visualising data it is important to think about how visual decisions change narratives and impact audiences.
Data visualisations carry an aura of certainty in their design and reputable sources of data all convey authority…
D’Ignazio says, “but in situations like this, those conventions are doing us a disservice.”
Data visualisation experts, such as those at Nightingale, ESRI & Chartable, are offering top tips for creating better visuals.
Caption box: read more about data visualisation literacy:
Image of The Data Storytelling Workbook cover, by Anna Feigenbaum and Aria Alamalhodaei.
Image of Data Feminism cover, by Catherine D’Ignazio and Lauren F. Klein.
Image of Data Visualisation: A Handbook for Data Driven Design cover, by Andy Kirk, second edition.

Thank you so much for reading! This is a living webcomic so please feel free to send feedback and ideas.

Get in touch with Alex at aalberda@bournemouth.ac.uk. Alex is a PhD candidate in Graphic Medicine and research illustrator.

Or get in touch with Anna at afeigenbaum@bournemouth.ac.uk. Anna is an Associate Professor in digital storytelling and co-author of The Data Storytelling Workbook (Routledge 2020).

FURTHER READING AND SOURCES:

What’s a Numerator and Denominator?

Epidemiology terminology: https://www.healthknowledge.org.uk/public-health-textbook/research-methods/1a-epidemiology/numerators-denominators-populations

What’s the Problem?

Why rates differ: https://www.bbc.com/future/article/20200401-coronavirus-why-death-and-mortality-rates-differ

How rates will affect the UK lockdown: https://news.sky.com/story/coronavirus-the-four-factors-that-will-decide-when-the-uks-lockdown-can-end-11969844

Bring Us Denominators!

Randy Au’s work on data literacy: https://towardsdatascience.com/data-literacy-via-covid-19-38965538f390

What’s in a Numerator?

Behind Italy’s death rate: https://www.telegraph.co.uk/global-health/science-and-disease/have-many-coronavirus-patients-died-italy/

BBC postcode and COVID-19 map: https://www.bbc.co.uk/news/uk-51768274

Coronavirus and male risk (Philip Ball): https://www.theguardian.com/commentisfree/2020/apr/07/coronavirus-hits-men-harder-evidence-risk

Communication themes and COVID-19 (Andy Kirk): https://www.visualisingdata.com/2020/03/communication-themes-from-coronavirus-outbreak/

Gender data gaps (WHO): https://www.who.int/activities/closing-data-gaps-in-gender

Why Population Size Matters

Calling out COVID-19 misinformation: https://www.wired.com/story/professors-call-bullshit-covid-19-misinformation/

Data Feminism book (D’Ignazio): https://www.amazon.com/Feminism-Strong-Ideas-Catherine-DIgnazio/dp/0262044005

CDC US cases update: https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html

Using Data Visualisation to Dramatise the Pandemic

Deconstructing COVID-19 data viz (Andy Cotgreave): https://gravyanecdote.com/visual-analytics/coronavirus-be-wary-how-we-visualise-data/

Responsible data vis (Kenneth Field): https://www.esri.com/arcgis-blog/products/product/mapping/mapping-coronavirus-responsibly/

Projections Are Not an Exact Science

Projections and COVID-19 spread (Angus Loten): https://www.wsj.com/articles/scientists-crunch-data-to-predict-how-many-people-will-get-coronavirus-11584479851

Models and projections (Zeynep Tufekci): https://www.theatlantic.com/technology/archive/2020/04/coronavirus-models-arent-supposed-be-right/609271/

Simulation Shock graph source: https://www.nature.com/articles/d41586-020-01003-6

Alexandra P. Alberda
Nightingale

Research Illustrator, PhD candidate in Graphic Medicine and public engagement, Research Assistant for Civic Media Hub. Twitter: @ZandraAlberda