AI model suggests high-risk groups for COVID-19 starting at age 45

Kunumi
Published in Kunumi Blog
6 min read · Apr 10, 2020

Preliminary findings point to a non-obvious pattern in the relationship between a country’s population pyramid and its predicted number of deaths.

Read this article in Portuguese.

ABSTRACT
Analysis of the impact of different data points on the predictions of a machine learning model indicates that a larger elderly population leads to an initially quieter and apparently flatter mortality curve, which is then followed by a very rapid progression in the number of deaths. A similar pattern occurs with the population of adults aged 45 to 49: the model interprets a larger population in this age bracket as an important feature for explaining a higher number of deaths.

The new coronavirus pandemic has reached a critical point globally, with social distancing and isolation measures adopted by practically all government authorities and private sector agents worldwide.

At Kunumi, we believe in science, knowledge, and technology as tools to light the darkness ahead. As such, we propose that interpretable machine learning models — a subgroup of artificial intelligence — be used to better understand complex patterns and phenomena which may be invisible to human cognition.

In this essay, co-authored with LIA-UFMG (the Artificial Intelligence Laboratory of the Federal University of Minas Gerais), we highlight some attention-worthy patterns that emerged from our analysis of the COVID-19 mortality curve predicted by AI models. These predictions allow us to visualize which data points have a greater influence on the predicted future number of deaths due to COVID-19.

Many models can predict the number of COVID-19 related cases and deaths. However, not all of these initiatives make the effort to explain their predictions, which leaves them as “black boxes”. We are committed to building explainable and transparent models that follow understandable and auditable decision processes. Based on the model’s predictions, we can observe relevant patterns and trends that have the potential to influence decision-making on coronavirus containment measures.

We invite the scientific community to contribute to this debate. We also plan to publish more analyses and tools to help prevent and mitigate the pandemic’s effects.

What we did

Our model combines the official death toll data released daily by WHO (World Health Organization) with information from a variety of related datasets: demographic (population density, age distribution, etc.), health (doctors per capita, hospital beds, etc.), urban mobility, climate, geography, and more. Our initial insights here are based on the age distribution.
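As a rough illustration of what combining these sources could look like, the sketch below joins daily death counts with static per-country features into one flat record per country and date. The function name, field names, and values are all hypothetical placeholders; the article does not describe the actual schema or pipeline.

```python
# Hypothetical illustration: field names and values are placeholders,
# not the authors' schema or data.

def build_rows(daily_deaths, country_features):
    """Join daily death counts with static per-country features.

    daily_deaths: list of (country, date, cumulative_deaths) tuples.
    country_features: dict mapping country -> dict of static features.
    Returns one flat dict per (country, date) pair; countries with no
    feature record are skipped.
    """
    rows = []
    for country, date, deaths in daily_deaths:
        features = country_features.get(country)
        if features is None:
            continue
        row = {"country": country, "date": date, "cumulative_deaths": deaths}
        row.update(features)
        rows.append(row)
    return rows


daily_deaths = [
    ("A", "2020-03-30", 10),
    ("A", "2020-03-31", 14),
    ("B", "2020-03-31", 3),
    ("C", "2020-03-31", 7),  # no features available, so this row is dropped
]
country_features = {
    "A": {"population": 1_000_000, "share_aged_45_49": 0.07, "share_aged_65_plus": 0.20},
    "B": {"population": 5_000_000, "share_aged_45_49": 0.05, "share_aged_65_plus": 0.08},
}

rows = build_rows(daily_deaths, country_features)
```

Each resulting row carries both the time-varying target and the static demographic features, which is the shape most tabular learning methods expect.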

To evaluate our model’s error, we had it predict the number of deaths for past days and compared those predictions with the real numbers published by WHO. In this comparison, the model’s error tends to be low. We also synchronized the time span variable across countries so that their timelines are comparable. With this, the model takes into account differences and similarities between countries, leading to more precise predictions.
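The two steps in this paragraph can be sketched as follows. The article does not say which error metric or which synchronization rule was used, so both choices here are assumptions: mean absolute error as the metric, and "days since the first reported death" as the shared time axis. All numbers are placeholders, not real data.

```python
def align_by_first_death(series, threshold=1):
    """Return the series starting at the first day whose cumulative death
    count reaches `threshold`, so index 0 means day 0 of the outbreak."""
    for i, value in enumerate(series):
        if value >= threshold:
            return series[i:]
    return []


def mean_absolute_error(predicted, observed):
    """Average absolute gap between predictions and reported values."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Placeholder cumulative death counts by calendar day (not real data).
country_a = [0, 0, 1, 3, 6, 11]
country_b = [0, 0, 0, 0, 2, 5]

aligned_a = align_by_first_death(country_a)  # [1, 3, 6, 11]
aligned_b = align_by_first_death(country_b)  # [2, 5]

# Backtest: compare placeholder predictions for past days with reported values.
predicted_a = [1, 2, 7, 10]
error_a = mean_absolute_error(predicted_a, aligned_a)  # (0 + 1 + 1 + 1) / 4 = 0.75
```

After alignment, index 0 refers to the same outbreak stage in every country, even though the calendar dates differ.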

The model’s prediction

Using Brazil as an example, the number of deaths predicted by the model for the day after the writing of this essay was 0.070 deaths per 100,000 people. One week before, this number was 0.021. Here we predict the values for March 31, as published in WHO report number 71, using data from different countries.
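To give these per-capita rates a sense of scale, the sketch below converts a rate per 100,000 people into an absolute count and back. The population figure is a rough assumption for Brazil in 2020 and does not come from the article, so the resulting counts are only indicative.

```python
def per_100k_to_count(rate_per_100k, population):
    """Convert a rate per 100,000 people into an absolute count."""
    return rate_per_100k * population / 100_000


def count_to_per_100k(count, population):
    """Convert an absolute count into a rate per 100,000 people."""
    return count * 100_000 / population


# Rough population of Brazil in 2020: an assumption, not taken from the article.
ASSUMED_POPULATION = 210_000_000

week_before = per_100k_to_count(0.021, ASSUMED_POPULATION)  # about 44
day_after = per_100k_to_count(0.070, ASSUMED_POPULATION)    # about 147
```

Working per 100,000 people is what makes countries of very different sizes directly comparable.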

The non-obvious pattern

A distinct pattern emerges when we analyze the relationship between the age distribution data (in all countries) and its influence on death toll predictions.

Intuitively, we would expect countries with a larger elderly population to show higher mortality rates right away, given this population’s susceptibility to the virus.

This is an obvious, easily understandable and justifiable pattern — and, thus, widely accepted.

Although this holds over longer timeframes (and at a certain point in time), our analysis shows that the pattern follows a counterintuitive trajectory over time. Initially, having a larger elderly population leads the model to predict a lower number of deaths.

One possible explanation for this finding is that elderly people, more likely to require special care, already tend to adopt some isolation measures, spending less time in contact with other individuals. This behavior is analogous to the social distancing recommendation and therefore contributes to a more optimistic prediction by the model.

However, as the epidemic evolves, the model interprets the large elderly population as a determining factor in the increase in the number of deaths, and this contribution grows exponentially.

Our hypothesis here is that at this stage, the younger population, less prone to reclusive behavior, contracts the virus and transmits it to the elderly through physical contact.

There is a reversal in the previously observed pattern — as a high-risk group, a large population of elderly individuals is now interpreted by the model as an important feature leading to a more pessimistic outlook.

This is a significant trend to observe, as it reflects the possibility that populations with a large number of older people may take longer to identify the steepening of the mortality curve and may therefore adopt more relaxed isolation measures.
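The sign reversal described in this section can be made concrete with a simple baseline-replacement attribution: a feature's contribution is the change in the prediction when that feature is swapped for a reference value. The article does not name the explanation method it used, so this is only one possible technique. The `toy_model` below is a hypothetical stand-in built so that the reversal is visible; it is not the authors' model, and all values are placeholders.

```python
import math


def toy_model(features, day):
    """Stand-in predictor (NOT the authors' model): by construction, the
    elderly share lowers the prediction early on and raises it later."""
    elderly = features["share_aged_65_plus"]
    weight = 0.5 * (day - 10)  # negative before day 10, positive after
    return math.exp(0.1 * day) * (1.0 + weight * elderly)


def contribution(model, features, baseline, name, day):
    """Change in the prediction when `name` is swapped for its baseline value."""
    reference = dict(features)
    reference[name] = baseline[name]
    return model(features, day) - model(reference, day)


country = {"share_aged_65_plus": 0.20}   # hypothetical older population
baseline = {"share_aged_65_plus": 0.10}  # hypothetical cross-country average

early = contribution(toy_model, country, baseline, "share_aged_65_plus", day=2)
late = contribution(toy_model, country, baseline, "share_aged_65_plus", day=30)
# early is negative (a more optimistic prediction); late is positive.
```

Evaluating the same contribution at several days is what exposes a trajectory rather than a single importance score.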

The importance of the 45–49 age bracket

Another non-obvious pattern that emerged from the model lies in the relationship between the presence of a high number of adults aged 45 to 49 in a population and the predicted death toll.

WHO defines the high-risk group for COVID-19 as individuals older than 60 or with preexisting conditions. However, our model already interprets a large population aged 45–49 as a reason for a more pessimistic prediction.

Combat strategies

The effect that different isolation measures have on the spread of the virus is of vital importance. One recurring strategy is vertical isolation, a mitigation method that proposes isolating only high-risk groups as a way to combat infections. Under this approach, younger people would carry on with their activities as usual.

The patterns that emerged from our model seem to suggest that this may not be the best strategy. Isolating only the elderly does not seem to be enough to keep this population from being infected. Considering the system as a whole, isolating everybody appears to be more effective in combating COVID-19.

It is important to note that the findings presented here are not definitive conclusions. They are the start of a conversation about the current stage of the pandemic, and we invite the scientific community to participate. The challenge of stopping the coronavirus pandemic is immense, and it will not be overcome without collaboration and joint effort among different minds. Do not hesitate to contact us with your thoughts, and we will find ways to collaborate.

Editor’s note: this article was updated on April 14 to clarify the date of predictions in the model.
