The Story behind Native State Populations and COVID Deaths per Capita

Brennan Raml
Fall 2022 — Information Expositions
Jan 24, 2023

This year, I have learned a lot about exploring information and framing that exploration as a story. I’ve looked at datasets ranging from the local to the national level, and these have been my favorite datasets to work with. This week, we were given a county data frame that contains many variables relating to age, sex, education, medical care, and political standing. While this data was collected from counties in all fifty states, I wanted to focus on my home state of Colorado. As someone born and raised in this state, I was curious about the relationships I could discover and learn more about.

To begin, I filtered the data frame to only include Colorado data and created a heat map. Creating a heat map as the first visualization allowed me to examine the relationships between all the variables in one visual. Here is the heat map I created:

Heat Map I created for starting analysis
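The two steps behind a heat map like this (filter to one state, then correlate every pair of numeric variables) can be sketched in plain Python. This is a minimal sketch, not the code used for the figure: the rows are made up for illustration, and every column name except `covid_deaths_percapita` is an assumption. With pandas, the same steps would typically be a boolean filter followed by `DataFrame.corr`, with the resulting matrix passed to a plotting function.

```python
from math import sqrt

# Made-up rows for illustration only; the column names are assumptions.
rows = [
    {"state": "Colorado", "pct_born_in_state": 60.0, "pct_college_grad": 15.0, "covid_deaths_percapita": 0.008},
    {"state": "Colorado", "pct_born_in_state": 45.0, "pct_college_grad": 30.0, "covid_deaths_percapita": 0.004},
    {"state": "Colorado", "pct_born_in_state": 30.0, "pct_college_grad": 60.0, "covid_deaths_percapita": 0.001},
    {"state": "Colorado", "pct_born_in_state": 50.0, "pct_college_grad": 35.0, "covid_deaths_percapita": 0.003},
    {"state": "Kansas", "pct_born_in_state": 70.0, "pct_college_grad": 20.0, "covid_deaths_percapita": 0.005},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Step 1: keep only the Colorado counties.
colorado = [r for r in rows if r["state"] == "Colorado"]

# Step 2: correlate every pair of numeric columns.
columns = ["pct_born_in_state", "pct_college_grad", "covid_deaths_percapita"]
corr = {
    (a, b): pearson([r[a] for r in colorado], [r[b] for r in colorado])
    for a in columns
    for b in columns
}

# Print the matrix that a heat map would color, one row per variable.
for a in columns:
    print(a.ljust(24), "  ".join(f"{corr[(a, b)]:+.2f}" for b in columns))
```

Each cell of the printed matrix corresponds to one colored square in a heat map, so scanning a row shows every relationship for that variable at once.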

Looking over the variables in the data, I decided to use the percentage of people born in the same state as the independent variable. As someone who fits within this demographic, I was curious to see what relationships I could find at first glance. This is where I noted an unexpected positive relationship between the percentage of people born in the same state and COVID deaths per capita. This got my attention immediately because there is no clear conceptual link between these two variables. A regression analysis quantified the relationship with a correlation coefficient of .646. The fitted slope is small, about .001, which likely reflects how small per-capita death values are rather than a weak relationship. A strong positive correlation with no conceptual link behind it suggests that the relationship is not directly causal. The question becomes: what other variables are driving this relationship?
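One detail worth illustrating is that the size of a regression slope depends on the units of the variables, while the correlation coefficient does not. The sketch below uses made-up numbers, not the Colorado data, and the helper `fit_line` is a hypothetical name introduced only for this example.

```python
from math import sqrt


def fit_line(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / sqrt(sxx * syy)


# Made-up values for illustration, not the Colorado data.
pct_born_in_state = [30.0, 40.0, 45.0, 55.0, 60.0]      # percent, 0-100
deaths_percapita = [0.001, 0.004, 0.002, 0.005, 0.007]  # deaths per resident

slope, intercept, r = fit_line(pct_born_in_state, deaths_percapita)

# Same data, with deaths re-expressed per 100,000 residents.
deaths_per_100k = [d * 100_000 for d in deaths_percapita]
slope_100k, _, r_100k = fit_line(pct_born_in_state, deaths_per_100k)

print(f"slope (per capita): {slope:.6f}, r = {r:.3f}")
print(f"slope (per 100k):   {slope_100k:.2f}, r = {r_100k:.3f}")
```

Rescaling the dependent variable multiplies the slope by the same factor and leaves the correlation unchanged, so a tiny slope on a per-capita scale says little on its own about how strong the relationship is.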

Returning to the heat map, I examined the other relationships that have the percent born in the same state as the independent variable. I was looking for variables that shared a relationship with it but also had more of a conceptual connection to the coronavirus pandemic. I noted a strong negative relationship with the percentage of college graduates, as well as strong positive relationships with the percentage of the population on assistance and the percentage of families in poverty. When I analyzed the correlation between the percentage of people born in the same state and the percentage of college graduates, I calculated a correlation coefficient of -.612. This finding showed me that I was on the right track. The opportunity to graduate from college contributes to quality of life and is tied to other variables in this data that relate to health care and finances. By analyzing the relationships with the percentage of college graduates, I could find more specific variables contributing to each county’s COVID deaths.

Scatterplot showing the negative relationship

Next, I looked at notable relationships where the percentage of college graduates is the independent variable. I brainstormed occupations that could have a negative impact on personal health, such as factory jobs, manual labor, or serving in the armed forces. Most of these positions are injury prone, so I focused on medical variables. The heat map shows that the percentage of college graduates has a negative correlation (stronger than -0.50) with the percentage of families in poverty, the percentage of the population on assistance, and support for Donald Trump in the 2020 presidential election.

After analyzing these relationships independently, the connections between the variables became apparent. The story begins with the first two variables, which show a negative relationship between the percentage of people born in the state where they reside and the percentage of college graduates. As more of a county’s population is born in the same state, a smaller share of the population has graduated from college. College is an opportunity that opens access to better-paying jobs and better working conditions. This helps explain the negative relationships between the percentage of college graduates and both the percentage of families in poverty and the percentage on assistance. Without a college degree, the jobs available tend to be more physically taxing, sometimes dangerous, and lower paying. With a higher chance of health problems and less access to quality medical care, these county populations were at a higher risk of COVID mortality. The negative relationship between the percentage of college graduates and support for Trump in 2020 can also be connected through COVID misinformation. In counties where fewer residents have graduated from college, more residents have a high school diploma as their highest level of education. When discussing this dataset in class, we visualized how these two education variables relate to support for Trump in 2020: as more people completed a higher level of education, support for Trump decreased. A lower percentage of college graduates could mean that a county was more prone to misinformation regarding COVID and public safety, which also likely contributed to its COVID deaths per capita. The story behind the originally examined relationship is a good example of how correlation doesn’t necessarily prove direct causation.

Now that we better understand the relationship between the percentage of people born in the same state and COVID deaths per capita, let’s examine specific counties to discover the story at the local level. I began by sorting Colorado counties by their COVID deaths per capita. The county with the highest value is Bent County at 0.008279, followed by Otero County at 0.006891. I also sorted the counties by the percentage of people born in the same state and found that Bent County has a high value of 60.8%, placing it within the top 10 counties for that variable. I had not heard much about Bent County before and therefore could not make any assessments from this data alone, so I researched the county’s population health and education using a USNews dashboard.
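The sorting step can be sketched as follows. The Bent and Otero values are the ones quoted above; "County A" and "County B" are placeholders with made-up numbers, and the dictionary layout is an assumption rather than the structure of the original data frame.

```python
# Bent and Otero values come from the text; "County A" and "County B" are
# placeholders with made-up numbers, included only so there is something to sort.
counties = [
    {"county": "County A", "covid_deaths_percapita": 0.002100},  # placeholder
    {"county": "Otero", "covid_deaths_percapita": 0.006891},     # from the text
    {"county": "County B", "covid_deaths_percapita": 0.000900},  # placeholder
    {"county": "Bent", "covid_deaths_percapita": 0.008279},      # from the text
]

# Sort from highest to lowest COVID deaths per capita.
ranked = sorted(counties, key=lambda c: c["covid_deaths_percapita"], reverse=True)

for rank, c in enumerate(ranked, start=1):
    # Per-capita values are easier to read when expressed per 100,000 residents.
    per_100k = c["covid_deaths_percapita"] * 100_000
    print(f"{rank}. {c['county']}: {per_100k:.1f} deaths per 100,000 residents")
```

Expressed this way, Bent County’s 0.008279 corresponds to roughly 828 deaths per 100,000 residents, and Otero County’s 0.006891 to roughly 689.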

Because Bent County is the focus of this investigation, the dashboard could give me insight into how the county compares with others in the state. It scores each county out of 100 in various categories relating to quality of life and then compares the scores with national averages. For population health, Bent County earned a score of 41. This category is assessed by access to care, healthy behaviors, health conditions, and health outcomes. The population has a higher-than-average smoking rate, and almost one-third of the population does not exercise during their free time. For education, Bent County earned a score of 18. The population has a high school graduation rate of 69% (the national median being 90%), and 21% possess an advanced degree. For the economy, Bent County scored 27, with a higher-than-average poverty rate and a lower-than-average household income.

The above data indicate that Bent County is behind other counties in population health, education, and the economy. Using the previously discussed connections between the county dataset’s variables, we can use the dashboard data to explore the possible stories that led to Bent County’s COVID deaths. To compare the data and reinforce the previously found relationships, I will look at Pitkin County as well. In Pitkin County, 28.3% of people were born in the same state, and the county is considered rural like Bent County. As mentioned previously, the percentage of people born in the same state has a negative relationship with the percentage of college graduates. Bent County’s 60.8% born in the same state lines up with its lower high school and college graduation rates. Pitkin County has a 90% high school graduation rate, with 67% having graduated from college. With education allowing access to better working conditions and pay, Pitkin County has higher labor force participation and a higher median household income. The prevalence of higher education also influences trust in vaccination and susceptibility to misinformation. As of 12/14/2022, 27.5% of Bent County’s population was fully vaccinated, compared to 86% of Pitkin County’s. Bent County’s support for Trump also shifted by +4.4 from 2016 to 2020, whereas Pitkin County’s shifted by -1.

Final visualization of all Colorado Counties

Above is a visualization comparing the macro relationships we’ve discussed across all counties in Colorado. It shows the negative relationship between the percentage of people born in the same state and the percentage of college graduates. The color of each dot represents the percentage of the population that supported President Trump during his 2020 campaign, and the size represents the COVID deaths per capita. This visualization shows where each county falls in the negative relationship, as well as how it relates to the variables ‘trump_2020’ and ‘covid_deaths_percapita’. Pitkin County can be seen in the top left of the graph, with one of the highest college graduation rates. We can see that less than a majority of its population supported Trump in 2020, and its COVID deaths were noticeably lower than in other counties. Bent County can be seen on the bottom right, with a majority supporting Trump in 2020 and noticeably higher COVID deaths. After this analysis, we can conclude that a strong but indirect relationship exists between the percentage of people born in the same state and COVID deaths per capita.
