Does High School Graduation Affect Adult Obesity?

Max Vali
Apr 14, 2022


INTRODUCTION

Finding an unusual relationship takes time. Outside factors acting on each variable can change whether two variables appear related at all. In this data-driven analysis, I hope to outline one such unusual relationship. After a selection process, the relationship we will explore today is between two variables: Adult obesity raw value and Percent high school graduate. At face value, these variables do not seem correlated, but by the end of this analysis I hope to show that they are. Moreover, I hope to verify the internal validity of this relationship as a causal one.

DATA

With nearly 1000 variables spread across three files, narrowing the scope is a challenge. Between the health data, the unemployment data, and the US counties data, there is a plethora of variables to choose from. To tackle this, I started by combining all of the files into one large data set. From there I set up a skeleton consisting of a correlation test and a scatter plot with a linear regression line. The skeleton took the two column names as variables, so swapping in a new pair was quick and I could test as many pairs as possible. I then kept testing pairs at random until I had three candidate relationships with a correlation above .2 or below -.2.

ADULT OBESITY vs PERCENT HIGH SCHOOL GRAD

After selecting one of the three candidate relationships, it was time to try to establish it as a causal relationship by examining its internal validity. For the relationship between obesity and graduation, I would first like to identify the type of causation. I would classify this as simple causation, because the percentage of high school graduates has a direct positive relationship with adult obesity. Below is a scatter plot with a linear regression line to demonstrate this relationship visually.

There does seem to be a relationship between these two variables; however, we should look for threats to its internal validity. The first threat I can see is a shared time trend. Over time, graduation rates are increasing, but so are obesity rates. According to the CDC, obesity rates have been rising year over year, so this is to be expected. Over the past three decades, high school graduation rates have also improved. With both variables increasing at a steady rate, the two can move together without one causing the other, which threatens internal validity. Another potential threat is confounding: many outside factors could affect both adult obesity and graduation rates. How the data were selected and measured is yet another threat, since the way people are classified as obese has changed over time.

CONCLUSION

After plotting this relationship and documenting the correlation, it is safe to say that there is some correlation between Adult obesity and Percent high school graduate. Furthermore, I believe this is a plausibly causal relationship, in which the percentage of high school graduates affects adult obesity. Although I am not entirely sure how the adult obesity raw value is interpreted, it did not show a strong relationship when tested against other variables, which lends some support to the internal validity. When I independently tested the percentage of high school graduates against other variables, the relationships were usually negative; with adult obesity the relationship is positive, which is strange! Plotting the two variables next to each other shows just how similar they are: as histograms, they nearly mirror each other. A simple correlation test between the two gives a correlation coefficient of about .43.
