How To Put Coronavirus Into Context

What Can 105 Years of Mortality Data Teach Us About Risk?

David Foster
Applied Data Science
11 min read · May 29, 2020

1. What’s This About?

We’re going to find out which week of the last century carried the same mortality risk as the peak week of the current coronavirus epidemic, in the following 6 European countries:

  • England & Wales
  • Sweden
  • Spain
  • Belgium
  • The Netherlands
  • Denmark

Why these 6 countries?

These are the countries that have publicly available mortality data spanning at least the years 1911-2016 — a long enough timeframe to conduct a contextual analysis of the current pandemic. They also have publicly available weekly mortality data covering the last decade, from 2010–2020. They cover a wide range of epidemic severities and governmental responses to COVID-19.

Why 105 years?

To establish the significance of the current pandemic, it is crucial that we analyse its effect on total deaths across a timeframe that is long enough to assess how unprecedented the spike really is. Usually deaths are compared to the average across the last 5 or 10 years. For example, the chart below shows weekly deaths from all causes for England and Wales compared with 10 previous years of data.

Total Deaths (England and Wales). Red line = 2020; Grey lines 2010–2019. Data source: ONS.

Whilst this is a perfectly valid way to highlight the short-term abnormality of the virus, it is also crucial to understand how far back we need to go before the mortality risk is no longer unprecedented.

2. Data

All data used in this analysis is sourced from the Human Mortality Database; an excellent publicly available source of detailed mortality and population data.

Let’s first define some key terms that we’ll be using throughout the analysis.

For each year, there are two important quantities to measure: the number of deaths and the total size of the population. This data is available by age group, where each group is up to 5 years wide (e.g. 1–4, 5–9, …, 85–89). There is also a group for <1 and one for 90+.

Mortality Rate

A mortality rate is the number of deaths per 100,000 people. This can be calculated per age group and per year. For example, the age-specific mortality rates for England and Wales in 1960 and 2010 are shown below.

Mortality rates for England and Wales (civilian deaths) by age group (1960 and 2010). Raw data sourced from mortality.org.

We can see that for most age groups, the mortality rate more than halved between 1960 and 2010. For example, if you took a group of 100,000 75–79 year olds from 1960, around 8,055 would die within the year. By 2010, this number was 3,600, a fall of 55%.
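The arithmetic behind that comparison can be sketched in a few lines of Python. The function names here are illustrative rather than taken from the original analysis; the two rates are the 75–79 figures quoted above.

```python
def mortality_rate(deaths: float, population: float, per: int = 100_000) -> float:
    """Return the number of deaths per `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * per / population


def percentage_fall(old_rate: float, new_rate: float) -> float:
    """Return the fall from old_rate to new_rate as a percentage of old_rate."""
    return (old_rate - new_rate) / old_rate * 100


# Rates for 75-79 year olds quoted in the text (deaths per 100,000).
rate_1960 = 8_055
rate_2010 = 3_600

fall = percentage_fall(rate_1960, rate_2010)
print(f"Fall in mortality rate, 1960 to 2010: {fall:.1f}%")  # 55.3%
```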

You might therefore think that measuring overall mortality rate is a good way to judge healthcare in any given year. However, if you plot the mortality rate for the population as a whole over time, you find that it stays roughly the same between 1920 and 1980.

Mortality rate for England and Wales (civilian deaths). Raw data sourced from mortality.org.

This is because as healthcare improves, the rate of deaths doesn’t necessarily change drastically; it’s just that they’re shifted to the older population as people live longer.

To give an example, which of these two populations would you say has better healthcare?

Population 1 — 100,000 people, where everyone dies before reaching 60 years old and 1000 people die every year (mortality rate = 1000)

Population 2 — 100,000 people, where 50,000 people are under 60 and 50,000 people are over 60. 200 of the under 60s die every year and 800 of the over 60s die every year (mortality rate = 1000)

Clearly, Population 2 has better healthcare, all other things being equal, but the mortality rates are the same.
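To make the point concrete, here is a minimal sketch that computes the overall rate and the under-60 rate for the two hypothetical populations above. The `AgeGroup` class and helper names are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgeGroup:
    label: str
    population: int
    deaths: int


def rate_per_100k(deaths: float, population: float) -> float:
    """Deaths per 100,000 people."""
    return deaths * 100_000 / population


def overall_rate(groups: List[AgeGroup]) -> float:
    """Mortality rate for the whole population, ignoring age structure."""
    total_deaths = sum(g.deaths for g in groups)
    total_population = sum(g.population for g in groups)
    return rate_per_100k(total_deaths, total_population)


# Population 1: nobody reaches 60, so there is a single age group.
population_1 = [AgeGroup("under 60", 100_000, 1_000)]
# Population 2: half under 60, half over 60.
population_2 = [
    AgeGroup("under 60", 50_000, 200),
    AgeGroup("over 60", 50_000, 800),
]

for name, groups in [("Population 1", population_1), ("Population 2", population_2)]:
    under_60 = groups[0]
    print(
        name,
        "overall:", overall_rate(groups),
        "under 60:", rate_per_100k(under_60.deaths, under_60.population),
    )
```

Both populations have an overall rate of 1,000 per 100,000, but the under-60 rate is 1,000 in Population 1 and only 400 in Population 2. The overall figure hides exactly the difference we care about.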

So what metric can we use to judge healthcare improvements, that takes into account the fact that the age distribution of the population will change over time?

Age Standardised Mortality Rate

The Age Standardised Mortality Rate (ASMR) is the weighted average of the age-specific mortality rates, where the weights are the proportions of persons in the corresponding age groups in a given reference year — in our case we’ll use 2016.

In other words, we calculate the mortality rate for a given year as if the population age-distribution is the same as 2016, so that we can make a like-for-like comparison.

Specifically, the ASMR is calculated as follows:

$$\mathrm{ASMR} = \sum_i M_i P_i$$

where M_i is the mortality rate of age group i and P_i is the proportion of age group i in the reference population, so that the P_i sum to 1.
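As a minimal sketch, the weighted average of M_i and P_i could be computed as below. The function name, the two coarse age groups and all of the numbers are illustrative assumptions, not Human Mortality Database figures; the real calculation would use the full set of age groups.

```python
from typing import Mapping


def asmr(
    age_specific_rates: Mapping[str, float],
    reference_population: Mapping[str, float],
) -> float:
    """Weighted average of age-specific rates (M_i), weighted by each age
    group's share (P_i) of the reference population."""
    if set(age_specific_rates) != set(reference_population):
        raise ValueError("rates and reference population must cover the same age groups")
    total = sum(reference_population.values())
    return sum(
        rate * reference_population[group] / total
        for group, rate in age_specific_rates.items()
    )


# Illustrative numbers only: rates in deaths per 100,000.
rates = {"under 60": 400.0, "60 and over": 1_600.0}
# Head counts in the (hypothetical) reference year; converted to proportions inside asmr().
reference = {"under 60": 75_000, "60 and over": 25_000}

print(asmr(rates, reference))  # 400 * 0.75 + 1600 * 0.25 = 700.0
```

Because the weights come from a single reference year, two years with identical age-specific rates get the same ASMR regardless of how old or young their actual populations were.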

If we plot the ASMR over time, a different picture emerges:

Age Standardised Mortality Rate for England and Wales (civilian deaths). Raw data sourced from mortality.org.

The impressive linear fall of ASMR is a testament to the incredible improvements in both healthcare and lifestyle across the century. We are medically better equipped than ever to tackle the biggest killers (heart disease / cancer) and are living healthier lives (less smoking, greater emphasis on fitness) than ever before.

This metric allows us to place the current coronavirus outbreak into context, for the 6 European countries where data is available back to 1911.

3. Charts

We’re first going to look at three key charts for England and Wales — the charts for the other five countries are given at the bottom of this article.

England and Wales

1. Age Standardised Deaths by year (1911–2016)

Age Standardised Deaths are what you get if you scale the ASMR to the population size in 2016. In other words, this is an estimate of the number of deaths in each year, if the age distribution and size of the population were the same as in 2016. This falls from 1.7 million in 1911 to 525,048 in 2016.
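The scaling step is a single multiplication. The sketch below assumes the ASMR is expressed per 100,000 people; the function name and both input numbers are made up for illustration and are not the figures behind the chart.

```python
def age_standardised_deaths(asmr_per_100k: float, reference_population_size: float) -> float:
    """Scale a rate per 100,000 up to a count for the reference population."""
    return asmr_per_100k * reference_population_size / 100_000


# Hypothetical inputs: an ASMR of 900 per 100,000 and a reference
# population of 58 million people.
print(age_standardised_deaths(900, 58_000_000))  # 522000.0
```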

2. Total Deaths from all causes by week (2010–2020)

This chart shows the short-term impact of the virus on overall deaths in comparison to data from the last decade. At its peak, there were 22,351 deaths in a single week, over 6,000 higher than the second highest peak, during the flu season of 2015.

3. Age Standardised Deaths (average and + 1 standard deviation) by week (estimated; 1911–2016)

We can use the weekly mortality dataset from the last decade to find the average proportion of deaths in each week of the year. Under the assumption that this remains constant over time, we can use these proportions to calculate an estimate for the age-standardised deaths per week back to 1911.

By looking at the spikes in this time series, we can then estimate which week from the last century presented an equivalent risk to the population as the current pandemic at its peak.

The chart above shows the estimated weekly fluctuations in age-standardised deaths by year. The left hand chart shows the expected average number of age-standardised deaths per week and the right hand chart shows this number if the proportion of deaths is one standard deviation higher than the average — i.e. a particularly ‘bad’ peak week, which happens about every 6 or 7 years. The red horizontal line across this chart shows the peak number of weekly deaths from the current coronavirus epidemic.
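The weekly estimate described above can be sketched with Python's standard library. Several things here are assumptions for illustration: the function names, the use of the sample standard deviation, the requirement that every year has the same number of weeks, and the toy data (four "weeks" per year rather than 52).

```python
from statistics import NormalDist, mean, stdev
from typing import Dict, List, Tuple


def proportion_profile(
    weekly_deaths_by_year: Dict[int, List[float]],
) -> Tuple[List[float], List[float]]:
    """For each week of the year, return the mean and standard deviation of
    that week's share of annual deaths across the years supplied."""
    shares_by_year = [
        [deaths / sum(weeks) for deaths in weeks]
        for weeks in weekly_deaths_by_year.values()
    ]
    per_week = list(zip(*shares_by_year))  # regroup: one tuple of shares per week
    return [mean(w) for w in per_week], [stdev(w) for w in per_week]


def estimate_weekly_deaths(
    annual_deaths: float, means: List[float], sds: List[float], n_sd: float = 0.0
) -> List[float]:
    """Spread an annual total across weeks, optionally n_sd deviations above average."""
    return [annual_deaths * (m + n_sd * s) for m, s in zip(means, sds)]


# Toy data: three recent years, four "weeks" each.
recent = {2017: [30, 20, 20, 30], 2018: [40, 20, 20, 20], 2019: [35, 25, 20, 20]}
means, sds = proportion_profile(recent)

# Apply the profile to a hypothetical historical year with 1,000
# age-standardised deaths.
average_weeks = estimate_weekly_deaths(1_000, means, sds)
bad_weeks = estimate_weekly_deaths(1_000, means, sds, n_sd=1.0)
print(average_weeks[0], bad_weeks[0])  # roughly 350 and 400

# If the shares were normally distributed, a week more than one standard
# deviation above the mean would occur about once every 6.3 years.
print(1 / (1 - NormalDist().cdf(1.0)))
```

The last line is where the "every 6 or 7 years" figure comes from, under the additional assumption of normality.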

Working backwards from the present, the first week in which the age-standardised deaths are comparable to those of the current epidemic is week 2 of 1985. We can also see that a ‘bad’ flu week in 1993 would carry a similar amount of risk.
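One way to express this lookup in code is to scan backwards through the estimated peak-week values until one reaches the current peak. This is a sketch of the idea rather than the code used to produce the charts; the function name is hypothetical and the numbers are arbitrary units invented for the example.

```python
from typing import Dict, Optional


def most_recent_comparable_year(
    peak_week_deaths_by_year: Dict[int, float], current_peak: float
) -> Optional[int]:
    """Return the latest year whose estimated peak-week deaths are at least
    as high as the current peak, or None if no year qualifies."""
    for year in sorted(peak_week_deaths_by_year, reverse=True):
        if peak_week_deaths_by_year[year] >= current_peak:
            return year
    return None


# Invented values in arbitrary units, falling over time as the ASMR falls.
estimated_peaks = {1950: 160.0, 1960: 140.0, 1970: 125.0, 1980: 105.0, 1990: 90.0}

print(most_recent_comparable_year(estimated_peaks, current_peak=120.0))  # 1970
print(most_recent_comparable_year(estimated_peaks, current_peak=500.0))  # None
```

Running the same function on the "+1 standard deviation" series instead of the average series would give the later year in which a ‘bad’ flu week carried comparable risk.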

4. Analysis

Let’s be clear on what this means. In 1985, healthcare was not as advanced as it is today. Many more people smoked: 33% compared to 14.4% today. There was far greater poverty: around 50% lived in a low income household, compared to 20% today, as growth in incomes has outstripped inflation. Cancer treatments were not fast tracked in the 1980s, and many patients waited several months for a referral to start treatment. Statins were introduced to fight cardiovascular disease in the 1990s and became commonplace in the 21st century. They are now the most commonly prescribed medication in the UK and have cut the risk of dying from heart disease by 28% in men.

So 1985 was dangerous, for lots of reasons. Add a flu season into the mix and things were even more dangerous. The same can be said for 1993, though a particularly bad flu week would be required to reach the same levels of risk as a normal flu season in 1985.

All of these reasons add up to a risk that is comparable to the peak of the current coronavirus epidemic. That is, on average, a person living during the peak of the 1985 flu season experienced about the same absolute risk as a person living during the peak of the 2020 coronavirus epidemic.

5. Interpretation

It is clear that our interpretation of the severity of the coronavirus pandemic is heavily dependent on the timescale over which we choose to analyse it.

If we zoom in, there is no question that there is greater risk of death due to the coronavirus outbreak, relative to other years in the recent past. Simply looking at the last decade of data for each country is enough to draw this conclusion with confidence.

However if we zoom out, we see that the risk of death during the current epidemic at its peak is comparable with that of the peak flu week in 1985. In other words, 1985 healthcare and lifestyle conditions in combination with the natural peak in the flu season created about the same absolute risk of death as the current coronavirus outbreak at its peak, with our improved healthcare and healthier lifestyle choices.

Relativity

In essence, this is a question of relativity, as all risk is relative.

Those who see only the short-term spike in deaths argue that the obviously heightened risk of death compared to the last decade of data proves the severity of the virus. Through a near-focus lens, they are right.

On the other hand, there are those who argue that we have lived through much riskier times than this without lockdown policies, often pointing to the 1968–69 Hong Kong flu as evidence. Through a long-range lens, this is also true.

Whilst these two statements of belief appear contradictory, they are not.

We can simultaneously recognise the heightened short-term severity of the virus, and all the strain on health services that this brings, while also understanding that not so long ago the environment in which we lived was every bit as risky as the current situation. We just didn’t feel it at the time, because each year felt very much like the previous year and we are simply not programmed to accurately assess risk across timeframes of decades. Indeed, it could be said that we collectively struggle to take proper action on climate change for exactly the same reason, but that’s a subject for another post.

6. Summary

We have explored a methodology for assessing which week in the last century carried comparable risk to the peak week of the current coronavirus epidemic, in six European countries. The summary of the findings is outlined below:

Summary of findings: A ‘bad’ peak flu week is a week where the proportion of yearly deaths in this week is 1 standard deviation above the mean. The Denmark mortality data does not display a notable peak for the current coronavirus epidemic, so the risk is no greater than would be otherwise expected.

England & Wales, Belgium and The Netherlands return similar results: in each country, a peak flu week in 1985 carried a risk comparable to the current peak of the epidemic. In Spain, the comparable year is slightly later, 1991, and in Sweden it is considerably later, 2002. This is mostly because the Swedish coronavirus peak is not as pronounced as in the other countries. Denmark has not yet experienced a peak in deaths as a result of the epidemic.

We have seen that there are many ways to interpret such results and we must take care to hold both the short and long term views of risk in our thoughts as we grapple with the complexity of the situation. Maintaining an open and inquisitive mind is key. It is the only way to ensure that these strange times continue to have more in common with 1985 than 1984.

Thanks for reading — if you enjoyed this article, please do leave some claps 👏👏👏!

Sweden

1. Age Standardised Deaths by year (1911–2016)

2. Total Deaths from all causes by week (2010–2020)

3. Age Standardised Deaths (average and + 1 standard deviation) by week (estimated; 1911–2016)

Spain

1. Age Standardised Deaths by year (1911–2016)

2. Total Deaths from all causes by week (2010–2020)

3. Age Standardised Deaths (average and + 1 standard deviation) by week (estimated; 1911–2016)

Belgium

1. Age Standardised Deaths by year (1911–2016)

2. Total Deaths from all causes by week (2010–2020)

3. Age Standardised Deaths (average and + 1 standard deviation) by week (estimated; 1911–2016)

The Netherlands

1. Age Standardised Deaths by year (1911–2016)

2. Total Deaths from all causes by week (2010–2020)

3. Age Standardised Deaths (average and + 1 standard deviation) by week (estimated; 1911–2016)

Denmark

1. Age Standardised Deaths by year (1911–2016)

2. Total Deaths from all causes by week (2010–2020)

3. Age Standardised Deaths (average and + 1 standard deviation) by week (estimated; 1911–2016)

Applied Data Science Partners is a London based consultancy that implements end-to-end data science solutions for businesses, delivering measurable value. If you’re looking to do more with your data, please get in touch via our website. Follow us on LinkedIn for more AI and data science stories!

David Foster
Applied Data Science

Author of the Generative Deep Learning book :: Founding Partner of Applied Data Science Partners