Life Expectancy: How do a country’s development factors affect it?

Jacob Punter
4 min read · Dec 16, 2019

The Global Health Observatory (GHO) is part of the World Health Organisation (WHO). The GHO collects and publishes more than 1000 health-related indicators for the WHO’s 194 member states. Here we take a brief look at a small selection of this data to ask three questions: which broad factors affect a country’s life expectancy, which countries have the highest and lowest life expectancy and why, and whether we can estimate life expectancy from a limited amount of data.

A brief look at WHO: Bringing health to life

What factors most affect life expectancy?

We can answer this question with a correlation heatmap. A correlation map gives a numerical value between -1 and +1 for each pair of variables, telling us how strongly positively or negatively correlated they are. The heatmap element simply maps a colour gradient to these values.
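As a rough illustration of how such a correlation map can be computed, here is a minimal pandas sketch. The data is synthetic and the column names and coefficients are invented stand-ins, not the WHO data-set or the exact code behind the figure. The `plot_heatmap` helper is a hypothetical name and assumes seaborn and matplotlib are installed.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: columns and values are invented for illustration,
# they are NOT taken from the WHO data-set.
rng = np.random.default_rng(0)
n = 200
schooling = rng.uniform(4, 18, n)
adult_mortality = 400 - 15 * schooling + rng.normal(0, 30, n)
life_expectancy = 45 + 2 * schooling - 0.02 * adult_mortality + rng.normal(0, 2, n)

df = pd.DataFrame({
    "Life_Expectancy": life_expectancy,
    "Schooling": schooling,
    "Adult_Mortality": adult_mortality,
})

# Pearson correlation between every pair of numerical columns (values in [-1, 1]).
corr = df.select_dtypes("number").corr()

# Rank the other variables by their correlation with life expectancy,
# most negative first.
ranked = corr["Life_Expectancy"].drop("Life_Expectancy").sort_values()
print(ranked)


def plot_heatmap(corr_matrix):
    """Draw the correlation matrix as a colour-coded heatmap (hypothetical helper)."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.heatmap(corr_matrix, vmin=-1, vmax=1, cmap="coolwarm", annot=True)
    plt.tight_layout()
    plt.show()
```

Calling `plot_heatmap(corr)` draws the coloured grid; the `ranked` series gives the same information as a sorted list, which is often easier to read when there are many variables.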

Correlation heatmap of Life Expectancy vs numerical variables

As we can see, the factors most strongly positively correlated with life expectancy are:

  • Income composition of resources (the Human Development Index in terms of income composition of resources, an index ranging from 0 to 1): how developed a country is appears to relate directly to its life expectancy.
  • Schooling: another indicator of a country’s development. More time spent in education suggests more funding, which in turn suggests a more developed country.

And the factors that are most negatively correlated:

  • Adult_Mortality: unsurprisingly, as mortality among the bulk of the population decreases, life expectancy increases.
  • HIV/AIDS: HIV/AIDS is not particularly prevalent in developed countries and is suppressible with well-funded healthcare.
  • Thinness: thinness in adolescence again links to the development status and wealth of a country. The wealthier and more developed a country, the less likely its younger generations are to be malnourished and die.

Other factors to note:

  • Polio, Hep B and Diphtheria vaccination rates show a noticeable correlation with life expectancy.
  • Strangely, higher alcohol consumption is positively correlated with life expectancy. I don’t think there is enough information here to allow much speculation; perhaps it’s a sign of development to enjoy legal recreational drugs?

What countries of the world have the highest and lowest life expectancy? Why?

Bottom 5:

Life Expectancy between 53–57

Lesotho, Côte d’Ivoire, Sierra Leone, Angola, Chad. Similarities:

  • All of these countries lie on the African continent
  • All have a ‘Developing’ Status
  • Each has a strikingly high adult mortality of at least 39%
  • None of these countries has a GDP of more than $5000
  • Vaccination rates drop as low as 50% in Chad

Top 5:

Life Expectancy of 89

Belgium, New Zealand, Norway, Italy, Finland. Similarities:

  • Most lie on the European continent
  • Finland strangely has a ‘Developing’ status, but otherwise these are all seen as ‘Developed’ countries
  • All have an adult mortality rate of less than 10%
  • Infant mortality is even lower, never exceeding 0.3%
  • Lowest GDP of these top 5 is $38,000
  • Where available, these countries all have a minimum vaccination rate of 95%
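One way such top and bottom lists can be produced is sketched below. It assumes each country's life expectancy is averaged over the available years before ranking, which may not be how the original analysis did it. The records and the placeholder country labels are invented, not real WHO figures.

```python
import pandas as pd

# Invented placeholder records: NOT real WHO figures or real countries.
records = pd.DataFrame({
    "Country": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Year": [2014, 2015] * 4,
    "Life_Expectancy": [54.0, 56.0, 81.0, 83.0, 70.0, 72.0, 62.0, 64.0],
})

# Assumption: average each country's life expectancy over the available years.
by_country = records.groupby("Country")["Life_Expectancy"].mean()

# Pick the extremes (two of each here, since the toy table is small).
bottom = by_country.nsmallest(2)
top = by_country.nlargest(2)

print("Lowest:", bottom, sep="\n")
print("Highest:", top, sep="\n")
```

With the real data-set, the same `nsmallest(5)` and `nlargest(5)` calls would give the five-country lists, and the resulting index could be used to look up the other columns (status, GDP, vaccination rates) for comparison.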

Can life expectancy be predicted based on this small data selection?

Yes! We can fit a Linear Regression model to this data, which gives us seemingly good results.

  • r squared on the training data: 0.969.
  • r squared on the test data: 0.935.

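This shows that when the model is asked to predict unseen test data, its r squared drops by only about 3.5% relative to the training set (from 0.969 to 0.935), which suggests the fit generalises reasonably well rather than simply memorising the training data.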
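To make the train/test comparison concrete, here is a minimal sketch of fitting a linear regression and computing r squared on both splits. It uses NumPy's least-squares solver on synthetic data rather than the code or library used in the original analysis; the feature meanings, coefficients and 80/20 split are all assumptions for illustration. An ordinary least-squares model from a library such as scikit-learn would give the same fit.

```python
import numpy as np

# Synthetic stand-in data: features and coefficients are invented,
# NOT taken from the WHO data-set.
rng = np.random.default_rng(42)
n = 500
X = np.column_stack([
    rng.uniform(4, 18, n),    # stand-in for years of schooling
    rng.uniform(50, 500, n),  # stand-in for adult mortality
])
y = 50 + 1.5 * X[:, 0] - 0.04 * X[:, 1] + rng.normal(0, 2, n)

# Shuffle the rows and hold out 20% as unseen test data (assumed split).
idx = rng.permutation(n)
split = int(0.8 * n)
train, test = idx[:split], idx[split:]


def add_intercept(features):
    """Prepend a column of ones so the model learns an intercept term."""
    return np.column_stack([np.ones(len(features)), features])


# Ordinary least squares fit on the training rows only.
coef, *_ = np.linalg.lstsq(add_intercept(X[train]), y[train], rcond=None)


def r_squared(features, target):
    """Coefficient of determination: 1 - residual SS / total SS."""
    pred = add_intercept(features) @ coef
    ss_res = np.sum((target - pred) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    return 1 - ss_res / ss_tot


r2_train = r_squared(X[train], y[train])
r2_test = r_squared(X[test], y[test])
print(f"r squared (train): {r2_train:.3f}")
print(f"r squared (test):  {r2_test:.3f}")
```

A small gap between the two scores is the pattern described above; a test score far below the training score would instead point to overfitting.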

Why do we even care about trying to predict life expectancy when it is usually available as a standard statistic of a country?

Within the data-set provided by the WHO, the world leader in health governance, there were still plenty of missing statistics for various countries. This was far more prevalent in less developed countries, likely because they lack the infrastructure and resources to collect this sort of data. If we have even a limited amount of data from such a country then, in theory, we could roughly predict its life expectancy or perhaps other missing health statistics. Further work is beyond the scope of this blog-post.

Appendix

The full analysis of this data-set, along with more visualisations, is available on GitHub: https://github.com/JPunter/data_science_blogpost/

The data used for this blog-post can be found on Kaggle: https://www.kaggle.com/kumarajarshi/life-expectancy-who

Thanks go out to WHO and GHO for data collection and to Udacity for the opportunity to build this first blog-post.
