A how-to for global C19 comparisons

Simon Nicholls
Pragmapolitic
16 min read · Jan 20, 2021

I’ve been experimenting with some short, thread-form articles on Twitter, like this, to limited success. The most annoying part is typos, as they’re a total pain to correct. That said, being objective about my own small place in the world, these long forms are hardly going viral.

We’ve been blighted since the start of the pandemic by opinion devoid of the tools to help us divine true meaning: looking knowledgeable, without concrete analysis. Malcolm Kendrick tends to fall into this trap. His maths drew him to false conclusions last year in this article, and this latest attempt falls foul of selection bias.

It is long, but basically just orders the worldometer data by deaths per 1 million of the population (d/1m), notes all those that come to the top have had lockdowns, and concludes lockdowns must make things worse.

The problem is that none of the conditions for susceptibility to death are equal between countries. Different factors exacerbate matters, e.g. policy choice, demographics, density, etc. Where these amplify each other, you’ll get worse outcomes. Those deaths will trigger governments to impose harsher policies, so exactly what he observes will be true, but not for the reasons he suggests. It’s the factors that drive it.

So ranking countries by deaths/1m, and reading conclusions off the top of the list, is selection bias.

Something he fails to recognise. Worse, he goes to the extent of listing 27 factors, but dismisses any value in assessing, with data, the degree to which any may impact death rates.

In August last year I wrote this article attempting to cover this topic specifically for Brazil. It gained little traction, so I’ve taken the time to dramatically expand the dataset, to show some startling realities.

The hardest part is finding age-specific deaths by country, broken down in enough detail for comparison. In this case, 10yr age bands. As of Feb 6th 2021 I’ve managed to find data for:

  • 18 countries in all parts of the world
  • NY state separately to the USA

So, let’s start with Kendrick’s great reveal, deaths/1m country totals (in blue), here ranked as he did by deaths/1m.

Immediately commentators will say: “Well Brazil’s not fared as badly as we have, so measures make no difference, they’ve been far less restricted and it hasn’t mattered.”

They’ve given up too early. My new mantra: “keep calm and do the data science”.

Remember the virus is spreading at different rates. Firstly, policy choice is going to drive outcome, and this includes the impact of tools like test & trace.

So without accounting for factors that will cause big variation in outcome, we can’t truly start to see who got it right.

For me, the key to understanding what we need to do comes from looking at d/1m rates within each 10yr age band. The columns in red are those in the 50–59 yr band. What jumps out immediately is how different they are to the population average, and they appear to be random.

However, they’re not. If we sort the data from most to least aging population, and add in green the average below 60, clarity emerges.

The younger the population, the greater the proportion of deaths <60: e.g. in Brazil/Peru about 30% of all their deaths, about 3x that in Eng&Wal.

Now most people fail at this point by letting the tail wag the dog. By that I mean don’t start with what you know about policy and impose it on outcome. Start with the outcome and infer what it must mean about the reality of policy.

For example, Peru’s d/1m in the 50–59 age bracket is twice its overall average, while the UK’s is 1/3 of its average. Even Brazil’s only matches its overall average. Bear in mind this is the oldest working-age part of the population, so regardless of what we are “told” policies were, or were not, they either had more spread, worse healthcare outcomes, or they’re more susceptible. So the claim that Peru had the “most draconian” lockdown may simply not be the reality on the ground. Logic would dictate that with no furlough, and worse healthcare than Brazil, to get more deaths in the young, people have had to carry on living more normally than us, leading to far more young death.

Those who jump to citing mobility data for Lima at this point (whether mobile phones moved around as much, or the degree to which people used public transport), which shows this group of people did lock down, need to ask themselves how many Peruvians relative to the UK have mobile phones, and the degree to which their mobility data measures a much smaller affluent middle class in Peru, not the working poor, versus the majority in the West.

What people are missing is this. Peru has 1/5 those 80+ relative to the UK. Now you might say this is d/1m, but when you combine d/1m figures from different age groups, you end up weighting each 10yr band by its relative proportion of the population. Which means Peru’s overall d/1m number reflects a much younger age group. Basically, to get a similar number to Eng&Wal, they just had far more young death.
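To make the weighting concrete, here is a minimal sketch. The band rates and age structures are made-up illustrative numbers, not figures from the dataset, and `overall_d_per_million` is a hypothetical helper name. It just shows that the overall d/1m is each band's d/1m weighted by that band's share of the population.

```python
def overall_d_per_million(pop_share, d1m_by_band):
    """Overall d/1m = sum over bands of (band d/1m x band share of population)."""
    total = sum(pop_share.values())  # normalise, so counts or shares both work
    return sum(d1m_by_band[band] * pop_share[band] / total for band in pop_share)

# Hypothetical per-band death rates (d/1m), identical in both countries
band_rates = {"<60": 100, "60-79": 2000, "80+": 10000}

# Hypothetical age structures: the younger country has 1/5 the share aged 80+
older_country = {"<60": 0.75, "60-79": 0.20, "80+": 0.05}
younger_country = {"<60": 0.90, "60-79": 0.09, "80+": 0.01}

older_total = overall_d_per_million(older_country, band_rates)
younger_total = overall_d_per_million(younger_country, band_rates)
print(round(older_total), round(younger_total))
```

With identical death rates in every band, the younger country's total comes out far lower (roughly 370 against 975 for these inputs). So for a young country to match an old country's overall d/1m, its rates within the bands must be much higher.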

The crucial thing we’re all overlooking in that reality is it means that somewhere in the world, there was capacity for people, who are likely biologically very similar to us, to die in far greater numbers.

You’re right to feel lost in the lack of pattern with the overall total d/1m figures, so what is going on? Well, the rest of this article unpicks this in depth. As a taster, we first need to consider demographics, after this the degree to which countries actually report deaths, then measures and density. e.g.

  • Germany: has an aging population, lower density, but by far the best measures in Europe at limiting spread, hence low numbers.
  • Belgium: has a slightly younger population than Germany or the UK, but is 50% denser than the UK, which is 2x denser than Germany. They also had far worse track & trace, but relative to Brazil/Peru, like most of Europe, have still had far more effective furlough. Hence the worst d/1m in Europe.

But, let’s not ramble like Kendrick, let’s approach this more methodically, and to frame this in something concrete, along the way in highlighted blocks we’ll use Nigeria as a case study. They’ve had 8 d/1m. If we account for differences in their population, can we get anywhere near Italy at 1415?

So starting with demographics, is this the biggest factor?

Using the same oldest-to-youngest order, and adding population data for 5 more countries, and the world as a whole, here are 25 populations by 10yr band.

Strikingly, more than half of Pakistan’s and Nigeria’s population are <20, which means expecting high numbers of deaths is silly.

The 50% line marks the median age, examples being:

  • Italy: 47
  • England & Wales: 41
  • The World: 30
  • Nigeria: 18
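As a rough sketch of how a median age falls out of 10yr band data, the 50% line can be located by accumulating band shares and interpolating linearly within the band where the running total crosses one half. The shares below are hypothetical, not any country's real figures, and treating the open-ended 80+ band as 10 years wide is an assumption.

```python
def median_age_from_bands(shares, band_width=10):
    """Estimate median age from population shares in consecutive age bands.

    shares[0] is ages 0-9, shares[1] is 10-19, and so on. Assumes people are
    spread evenly within each band (linear interpolation).
    """
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    cumulative = 0.0
    for i, share in enumerate(shares):
        frac = share / total
        if cumulative + frac >= 0.5:
            # The 50% line falls inside this band
            return band_width * (i + (0.5 - cumulative) / frac)
        cumulative += frac
    raise ValueError("could not locate the 50% line")

# Hypothetical very young population: more than half are under 20
young_shares = [0.30, 0.22, 0.17, 0.12, 0.08, 0.05, 0.03, 0.02, 0.01]
median = median_age_from_bands(young_shares)
print(f"median age is about {median:.1f}")
```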

So what is likely happening to make overall d/1m figures look so similar in Brazil & Peru, is that they’ve had far higher rates of spread and death in young to make up for having far fewer in the old.

So how have deaths actually played out?

For the original subset of countries that we have detailed data for, here are the proportions of deaths in each age band.

Immediately the global reality becomes clear. Peru has had 82% of deaths <80, 57% <70, 30% <60, whereas we’ve had 39% <80, likely as furlough has been far more successful in the young, and nearly all our deaths have come from the old.

Places like Ukraine, NYC, Moldova, Brazil and Peru, with fewer old people, have had fewer deaths from those bands, but places like Norway seem like odd exceptions? We’ll see a little later that having far fewer deaths makes places like these far less comparable.

So it seems to be correlated, but, does it explain all the difference? And…

… why does age matter so much?

Serology studies have sampled IgG antibodies to look for signs of prior infection in all ages. From these, and measured deaths, we can estimate, as studies like this and this have done, the IFR in each 10yr band, or as we’ll refer to it, the IFRa. Here are the figures I use, which are similar, but are averaged across these and other studies.

Now, for sure, it is possible, and there is no proof yet, that there are people with IgA/Tcell only resistance, but unless these are of markedly different proportions in each population and age, we can still work with the data we have, to see how the differences in population count in each age band will influence that population’s capacity to expect deaths in this subset of more severe infections.

e.g. age band population count x IFRa, added up across all bands, gives an estimate of the deaths possible with equal proportions infected in each band. Expressed as a share of the whole population, that is the population-weighted IFR, or IFRp.
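A minimal sketch of that calculation follows. The three coarse bands, the IFRa values and the population shares are placeholders chosen for easy arithmetic, not the averages used in this article, and `population_weighted_ifr` is a hypothetical helper name.

```python
def population_weighted_ifr(pop_by_band, ifr_by_band):
    """IFRp: sum of (band population x band IFRa) divided by total population."""
    total_pop = sum(pop_by_band.values())
    possible_deaths = sum(pop_by_band[b] * ifr_by_band[b] for b in pop_by_band)
    return possible_deaths / total_pop

# Placeholder IFRa values as fractions (assumed, for illustration only)
ifr_a = {"<40": 0.0002, "40-69": 0.004, "70+": 0.06}

# Hypothetical population shares for an old and a young country
old_pop = {"<40": 0.45, "40-69": 0.40, "70+": 0.15}
young_pop = {"<40": 0.85, "40-69": 0.13, "70+": 0.02}

ifrp_old = population_weighted_ifr(old_pop, ifr_a)
ifrp_young = population_weighted_ifr(young_pop, ifr_a)
print(f"old: {ifrp_old:.3%}, young: {ifrp_young:.3%}")
```

Even with these invented inputs the same per-band IFRa gives the older population several times the IFRp of the younger one, which is the effect the comparison relies on.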

Given we think IgG studies show the most serious group of infections, the variation should hold true, and once the proportion of lesser IgA/Tcell resistance proves true, we can adjust these estimates accordingly.

Meaning at least in the interim we’ll have a better guide of what to expect to play out in each country.

So what do these IFRp estimates look like?

Here they are in blue. Italy’s is the highest at 0.9%, due to having the largest aging population, but what is most striking is that Nigeria’s is just 0.09%.

So, without accounting for this level of difference in comparing total d/1m figures between two countries we are going to go very wrong.

The red bars, where I have granular data, are the IFRd for each country. This is the weighted mortality rate for the actual deaths. This is not the same as the CFR. It is calculated using the IFRas, but instead of using the population count in each band, you imply cases for each band from its deaths, add them all up, then use this as the denominator for all deaths to get the IFR for the deaths you actually had.
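As I read that description, the IFRd can be sketched as below. The IFRa values and death counts are placeholders for illustration, not figures from the dataset, and `death_weighted_ifr` is a hypothetical helper name.

```python
def death_weighted_ifr(deaths_by_band, ifr_by_band):
    """IFRd: total deaths divided by the cases those deaths imply, band by band."""
    implied_cases = sum(deaths_by_band[b] / ifr_by_band[b] for b in deaths_by_band)
    total_deaths = sum(deaths_by_band.values())
    return total_deaths / implied_cases

# Placeholder IFRa values as fractions (assumed, for illustration only)
ifr_a = {"<40": 0.0002, "40-69": 0.004, "70+": 0.06}

# Hypothetical deaths per band
deaths = {"<40": 20, "40-69": 400, "70+": 3000}

ifr_d = death_weighted_ifr(deaths, ifr_a)
print(f"IFRd = {ifr_d:.3%}")
```

Here the deaths imply 100k, 100k and 50k cases in the three bands, so 3,420 deaths over 250k implied cases gives an IFRd of about 1.37%. Comparing that against the IFRp for the same population shows whether deaths have skewed older or younger than an equal spread would predict.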

The red bars are the total for all time, but the same calculation can be used to look at how this changes week on week, with this plot showing exactly that rolling change for England & Wales deaths over time.

If either the total, or the weekly measure, is:

  • higher than the IFRp: means that deaths have skewed to the elderly, through something like care home outbreaks, e.g. Austria, Denmark, S. Korea and Switzerland. But we should bear in mind that overall d/1m matters. So, in Denmark & S. Korea we are talking about just a few deaths with an old-age skew, whereas Austria & Switzerland have done twice as badly as Sweden and the UK at protecting people in care homes.
  • lower: means a skew to the young, and seems to happen in countries that had less effective lockdown measures. This might be an indicator of the likely natural skew of infections towards the young, which lockdowns disrupt more, and which we aren’t accounting for in these equal-spread IFRp estimates. After all, the young tend to be more social. Or, that these countries have simply protected their elderly better at the same time as being more open, e.g. Ukraine. They might be the best place for those with #gbdelusion to look at replicating.

Nigeria: so picking up our promise to track Nigeria, in particular its 8 d/1m, were it to have Italy’s demographic, it would be at at least 80. So still more differences to account for…

What other factors are at play?

1. Healthcare outcomes: differing insurance schemes, beds/capita, treatment outcomes/quality, etc. mean things may vary.

The key observation to make is, most 2nd/3rd world country C19 death figures on worldometers, come from hospitals, so we’re only finding out about the cases they test and treat, probably very well. Which means the IFRas from mostly Western studies are likely to apply reasonably well. Especially when equipment deficiencies are likely to be made up for by doctors far more used to dealing with respiratory diseases, like Tuberculosis, than ours are.

This means if we use these deaths, and our IFRas, to imply cases by age, it is likely that similar numbers of real-world cases will have generated those deaths, certainly in the 2nd world, perhaps not in the 3rd.

What is missing is what all-cause excess deaths tell us about all those not lucky enough to make it into hospital. In the 1st world, virtually none, but Italy has 3.5x the beds/capita of Nigeria.

2. Unreported C19 deaths outside hospital: this will cause the lion’s share of the difference, but most places outside the 1st world don’t publish weekly all-cause death figures like the ONS, they mostly do it annually.

News reports in Brazil are that hospitals (free at the point of use like the NHS) filled up from May till Aug, and as this estimate of early all-cause data for Latin American cities shows, Manaus has had excess deaths at 5x the rate of its reported C19 deaths, so we would be foolish to assume similar has not happened in Nigeria, etc.

Add to this, if you aren’t in hospital, you are far less likely to be tested for C19, with Nigeria having 1/100th the tests/1m rate of Italy. Granted only 2% in the West make it to hospital, so if 95% of that Italian testing is providing spread intel, and not to identify cases that will die, this probably means the 5x difference in Peru is a reasonable comparison.

Nigeria: so their beds/capita is 0.9/1k, but Peru’s is near twice that at 1.7, so we could be looking at say 9x — e.g. 80 x 9 = 720 d/1m, if reporting was better.

Once we know about these extra deaths, it is likely that their IFRs will be far worse in having been denied treatment.

Can we estimate how bad?

In the UK, the Hospital Fatality Rate (HFR) is reasonably easy to calculate, and it shows that since the shock of the 1st peak, where it hit 35% due to worse treatment, and us simply only testing people as they died, it has now stabilised at 15%.

We also know that 50% of those in hospital need oxygen, so from this we can conclude that without oxygen the mortality rate outside is likely to be at least 3x higher.

So, once we know the extra deaths, we can adjust all these figures, but we need to make sure that we don’t overestimate spread, as these deaths will likely have needed only 1/3 of the cases to cause them.
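A small sketch of that adjustment follows. The 3x multiplier is the rough figure from the text above, while the death count, the 1% IFR and the helper name `implied_cases` are illustrative assumptions.

```python
def implied_cases(deaths, ifr, mortality_multiplier=1.0):
    """Cases needed to produce `deaths` at a fatality rate of ifr x multiplier."""
    return deaths / (ifr * mortality_multiplier)

# Hypothetical: 300 deaths in one age band with an assumed IFRa of 1%
treated = implied_cases(300, 0.01)         # deaths in hospital, with treatment
untreated = implied_cases(300, 0.01, 3.0)  # deaths outside, mortality 3x higher

print(round(treated), round(untreated))
```

The same number of deaths outside hospital implies about a third of the cases (roughly 10,000 against 30,000 here), which is why adding unreported deaths should not scale the spread estimate one for one.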

3. Comorbidity rates: as per this OECD study, obesity is a developed world problem. Which means despite our better healthcare, we are probably more likely to die from C19 than the developing world.

The reality is that there simply isn’t the data to try to account for these; it would be far too speculative a piece of guesswork. However, we can gauge the magnitude. Given the OECD average elevates say India’s 5% by only 15%, it seems unlikely that this would treble mortality for 100% of people, like adding 10yrs to age does.

So with increased obesity likely offsetting treatment being better in 1st world hospitals, we may find this has less impact than we think.

We just need to be very aware that this is a bit unknown.

Nigeria: but, Nigeria’s obesity rate is about 1/3 that of Italy’s, so we could make an adjustment to the 720 d/1m we’ve adjusted so far and get to 2160 as Nigeria’s new adjusted figure. In doing so, we are already starting to estimate that per comparable death they fared worse than Italy.
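The running Nigeria adjustment so far can be laid out as a simple chain of multipliers. All the factors are the rough ones quoted in this article; the labels and variable names are just for illustration.

```python
reported = 8  # Nigeria's reported d/1m

# Each step: (what we adjust for, multiplier quoted in the text)
adjustments = [
    ("demographics: Italy's IFRp 0.9% vs Nigeria's 0.09%", 0.9 / 0.09),
    ("unreported deaths outside hospital", 9),
    ("obesity at Italy's rate, roughly 3x Nigeria's", 3),
]

running = reported
trail = []
for label, factor in adjustments:
    running *= factor
    trail.append((label, round(running)))
    print(f"{label}: {round(running)} d/1m")
```

The steps print 80, 720 and 2160 d/1m in turn. Each multiplier is a coarse estimate, so the chain is a guide to magnitude rather than a precise figure.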

4. Lifestyle factors: the biggest of these is density, the degree to which we live close together. With an airborne virus, the frequency of contacts is crucial. So far I’ve done two deep-dive looks at this, for Sweden vs the UK (here) and Switzerland (here), and it is considered to a lesser degree in the piece on Brazil that this article is expanding on.

It is crucial to make sure you don’t do this naively. You have to look at urban density, isolating and matching up city centre densities. To give you an idea, the piece on Sweden considers 291 separate areas, breaking these up and matching them up to near 35k units of population in the UK, concluding the UK is 2.8x more urbanly dense. This is the density comparison that matters.

Beyond this, the:

  • degree to which the elderly live at home or in carehomes
  • close knitness of family life
  • familiarity in greeting (Japanese formality vs Italians hugging and kissing)
  • etc

Will all impact the ease with which the virus spreads.

But, in a random chaotic system luck plays a part. Infect a superspreader early or not, etc. So a point in time comparison just finds countries at different stages in their overall spread, and no two will be the same.

Nigeria: density will play a part. Its cities are likely more densely packed than Italian ones, but commuter rates are far lower, and Italy is 70% urbanised to Nigeria’s 50%, meaning there are just more people living more sparsely. Even getting decent data to compare this is tricky.

But, we’re becoming very subjective. I think the point is proved: Nigeria’s 8 d/1m is, when adjusted for the big measurable differences, not that dissimilar to Italy’s. In fact it is easy to argue the case that it is likely worse.

So how much spread and possible deaths have we had?

All these caveats in mind, accounting for demographic differences, is going to tell us something more accurate.

Here are estimates of:

  • the % of possible deaths
  • how much spread they imply there has been across age groups, to cause those

Most striking are Brazil & Peru, but to help understand we need to have a feel for the dynamic of these two numbers. For % of deaths all the weight of the number is in the older age groups, and with many deaths these groups will have the most certain IFRs.

The spread number is sort of the opposite, as a small number of deaths in the very young will imply very large numbers of cases, so these are very susceptible to the error in these lower age band IFRas, which we already know must be more noisy, as they are measured from far fewer deaths.

Case in point: Peru cannot be at 101.81% spread. It is likely an indicator that healthcare outcomes may be relatively worse for the young in the 2nd world.
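To show why the two numbers behave so differently, here is a sketch under one plausible reading of them: the % of possible deaths as actual deaths over population x IFRa summed across bands, and spread as implied cases over total population. The populations, deaths and IFRa values are all invented for illustration, and both function names are hypothetical.

```python
def share_of_possible_deaths(deaths, pop, ifr):
    """Actual deaths as a fraction of deaths possible if everyone were infected."""
    possible = sum(pop[b] * ifr[b] for b in pop)
    return sum(deaths.values()) / possible

def implied_spread(deaths, pop, ifr):
    """Cases implied by each band's deaths, as a fraction of total population."""
    cases = sum(deaths[b] / ifr[b] for b in deaths)
    return cases / sum(pop.values())

# Placeholder inputs (assumed, not from the dataset)
ifr_a = {"<40": 0.0002, "40-69": 0.004, "70+": 0.06}
pop = {"<40": 5_000_000, "40-69": 4_000_000, "70+": 1_000_000}
deaths = {"<40": 20, "40-69": 400, "70+": 3000}

base_share = share_of_possible_deaths(deaths, pop, ifr_a)
base_spread = implied_spread(deaths, pop, ifr_a)

# Add just 100 deaths in the youngest band and recompute
more_young = dict(deaths, **{"<40": 120})
new_share = share_of_possible_deaths(more_young, pop, ifr_a)
new_spread = implied_spread(more_young, pop, ifr_a)

print(f"share of possible deaths: {base_share:.2%} -> {new_share:.2%}")
print(f"implied spread: {base_spread:.2%} -> {new_spread:.2%}")
```

With these inputs, 100 extra young deaths barely move the share of possible deaths but triple the implied spread from 2.5% to 7.5%. That sensitivity to the low-band IFRas is how an estimate can drift above 100%.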

The most comparable numbers to Eng&Wal are from NY state. They’ve seen more than 2x our spread rate, but just 25% more deaths. So they’ve had far more spread in the young. This is the clearest indicator that, given similar healthcare outcomes and density, it can still be worse.

Sure, the IFRas may be overestimating this, but given we’ve already shown 2nd/3rd world all-cause deaths are going to be much higher, this is not going to be true. Remember the IFR of the deaths outside hospital are likely to be at least 3x higher, so will add far less spread, but even so, they are likely to show that the younger age band IFRas, derived from far fewer deaths, are likely to be too high, as spread cannot go above 100%.

Here’s the untold reality: Peru, more so than Brazil, is likely close to true herd immunity. Looking at their case rates and deaths, Peru is indeed not yet seeing a 2nd wave resurgence anywhere near as large as Brazil’s. It’s worth noting that Brazil’s is happening in their summer.

So how do deaths actually compare by age?

Finally, we can bring all the data together. Here, for each 10yr age band, we show how many times more/fewer deaths they have had relative to those in England & Wales, e.g. Peru has seen 4.73x as many 50–59 yr olds die.
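A minimal sketch of that per-band comparison, assuming the ratio is taken on d/1m within each band so that different population sizes cancel out. The deaths and populations are hypothetical, not the England & Wales or Peru figures, and the helper names are invented.

```python
def d_per_million(deaths, pop):
    """Deaths per 1 million people within each age band."""
    return {band: deaths[band] / pop[band] * 1_000_000 for band in deaths}

def ratio_to_reference(country_d1m, reference_d1m):
    """How many times higher (>1) or lower (<1) each band is than the reference."""
    return {band: country_d1m[band] / reference_d1m[band] for band in reference_d1m}

# Hypothetical reference country (two bands only, for brevity)
ref_d1m = d_per_million({"50-59": 4000, "80+": 60000},
                        {"50-59": 8_000_000, "80+": 3_000_000})

# Hypothetical comparison country: younger, with far fewer people aged 80+
other_d1m = d_per_million({"50-59": 6000, "80+": 6000},
                          {"50-59": 3_000_000, "80+": 400_000})

ratios = ratio_to_reference(other_d1m, ref_d1m)
for band, ratio in ratios.items():
    print(f"{band}: {ratio:.2f}x")
```

In this invented case the comparison country has 4x the reference death rate in the 50–59 band but only 0.75x in the 80+ band, the kind of per-band contrast the overall d/1m totals hide.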

We can make the following observations:

  • With 1/5 those >80, Brazil & Peru have had far higher deaths in all age bands, leading to the much higher implied spread. Now some chunk of this is going to be worse healthcare, but NY state has seen very similar ratios to Brazil, suggesting that both of these (given Brazil’s healthcare is much better than Peru’s) have simply had more spread, and that we have more capacity for death.
  • It makes it pretty clear that despite claims (mostly backed by what they said, and mobility data that covers a far smaller proportion of their population) Peru has simply had the least effective lockdown globally.
  • Why are Brazil and NY state so similar? NYC is very dense, the only place in the West that gets anywhere near favela density. Plus, do they have a similar work ethic?
  • Ukraine, in all ages <70, has had very similar outcomes to us. Measures have been similar, but they have done stonkingly better than us at protecting the elderly and vulnerable. They are perhaps the model to go and look at if you are a #gbdelusion fanatic.
  • The European Union generally, has probably locked-down too much, and been relatively worse at protecting their elderly than us.
  • South Korea, just makes us all look like we don’t get it.

The overriding thing to take away from this very extensive look at the data at play is simply that capacity for mortality in the UK seems to be higher than we have experienced, and that when it boils down to it, you can start with Nigeria’s 8 d/1m, and get it to 2160 d/1m if they had Italy’s demographic, quality of death reporting, and health/lifestyle factors.

Bottom line: adjust for demographics first, then reported deaths, then healthcare, then density, and if you don’t have the cause, you’ve got your sums wrong, again, like Kendrick.

Simon Nicholls
Pragmapolitic

Father, quant analyst, journalist blogger & editor, libertarian, political pragmatist