Coffee and Causation

We all love coffee. But how are this love of coffee and our consumption of it affecting the lives we live? Is the impact positive or negative? Earlier this week, I stumbled upon an interesting article that made a bold claim linking coffee consumption with a lower risk of death. The claim intrigued me enough to dig into the causal relationship with an exploratory data analysis. When I first read it, I wondered which factors related to coffee consumption might correlate with a decreased risk of death. I’m going to examine the life expectancy side of this argument, because that is the most natural measure given the claim.

We have to dissect this claim further because the article gives little to no quantitative evidence for its statement. We are told only that coffee drinkers have a lower risk of death. I’m going to test a more specific version of this idea: countries with higher coffee consumption have a greater life expectancy. This revised claim is more specific and can be checked against data. Personally, I would expect coffee consumption to have a negative correlation with life expectancy. People who drink coffee in excess are often thought to be compensating for a lack of energy. Natural energy comes from getting good sleep, eating well, and reducing stress. When that natural energy isn’t there, people usually resort to caffeine. A poor sleep schedule or constant stress will likely decrease your life expectancy if the pattern continues over a long period of time. That is why I’m interested in dissecting the article’s claim further.

I found two datasets on Kaggle to help this process along: one on life expectancy and one on coffee consumption, both broken down by country.

Life Expectancy:

Coffee Consumption:
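
Since both datasets are keyed by country, the first step is to line them up on country name. Here is a minimal sketch of that join in plain Python. The numbers are made-up placeholders, not values from the Kaggle files, and the alias table and the names `normalize` and `join_by_country` are my own assumptions for illustration.

```python
# Placeholder rows standing in for the two Kaggle files.
# The numbers are invented for illustration only.
life_expectancy = {"Brazil": 75.0, "Colombia": 74.0, "Vietnam": 76.0, "Norway": 82.0}
coffee_consumption = {"Brazil": 21.0, "Columbia": 2.0, "Viet Nam": 2.5, "Mexico": 2.4}

# Assumed spelling variants that need to be reconciled before joining.
ALIASES = {"Columbia": "Colombia", "Viet Nam": "Vietnam"}


def normalize(name):
    """Trim whitespace and map known spelling variants to one canonical name."""
    name = name.strip()
    return ALIASES.get(name, name)


def join_by_country(life, coffee):
    """Keep only countries present in both datasets (an inner join)."""
    life_n = {normalize(k): v for k, v in life.items()}
    coffee_n = {normalize(k): v for k, v in coffee.items()}
    shared = sorted(life_n.keys() & coffee_n.keys())
    # Each value is a (consumption, life_expectancy) pair.
    return {country: (coffee_n[country], life_n[country]) for country in shared}


merged = join_by_country(life_expectancy, coffee_consumption)
```

Countries that appear in only one file drop out of the join, so it is worth checking how many rows are lost before drawing any conclusions.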

From here we can attempt to test the claim made by the article. I did a deep dive into the coffee dataset to understand which countries had the highest percentage of consumption. Brazil, Indonesia, Colombia, Ethiopia, Guatemala, Honduras, India, Mexico, Uganda, and Vietnam have the highest consumption rates.

From this chart, we can see that Brazil has the largest percentage of consumption and the others are not far behind. But does this statistic correlate with the average life expectancy of each of these countries?

Red line = Average Life Expectancy

The chart above shows the life expectancy of a person residing in each of the countries from the earlier line graph. We can see from the red line that most of these countries have a life expectancy below the average. At least for these countries, then, the author’s claim that coffee consumption is correlated with a lower risk of death does not hold. We would have to bring in many more factors to understand this relationship in depth.
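
The comparison above boils down to two steps: rank countries by consumption, then check which of the top consumers fall below the average life expectancy. A rough sketch follows, using placeholder countries and invented numbers rather than the real data. I am assuming the red line is the average over every country in the dataset, and the function names are hypothetical.

```python
from statistics import mean

# Placeholder rows: (country, consumption share in percent, life expectancy in years).
# These values are invented for illustration and are not from the datasets.
rows = [
    ("A", 40.0, 70.0),
    ("B", 25.0, 66.0),
    ("C", 15.0, 74.0),
    ("D", 12.0, 82.0),
    ("E", 8.0, 78.0),
]


def top_consumers(rows, n):
    """Return the n rows with the highest consumption share."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


def below_average(rows, n):
    """Average life expectancy over all rows, plus the top-n consumers below it."""
    avg = mean(r[2] for r in rows)
    return avg, [r[0] for r in top_consumers(rows, n) if r[2] < avg]


average, below = below_average(rows, n=3)
```

Whether the average is taken over all countries or only over the top consumers changes where the red line sits, so that choice should be stated alongside the chart.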

Coffee consumption may in fact be associated with a lower risk of death, but it is by no means a driving factor, as we can see from the analysis above. Other variables are clearly contributing to any such association. We could take into account GDP, whether or not a country is developing, geographic information, and much more.
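
One way to go beyond eyeballing ten countries is to compute a correlation coefficient across every country in the joined data. Below is a small sketch of the Pearson correlation, written by hand so it needs no extra libraries. The two lists are invented placeholders, not results from the datasets, and this is not the method used for the charts above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values: consumption share and life expectancy for five countries.
consumption = [1.0, 2.0, 3.0, 4.0, 5.0]
life = [80.0, 78.0, 77.0, 73.0, 72.0]

r = pearson(consumption, life)  # negative for these made-up numbers
```

The same function could be pointed at GDP or any other numeric column to see how strongly each one tracks life expectancy. Even a strong coefficient only shows association, not that one variable drives the other.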

Nathan Duffy
Spring 2019 — Information Expositions
