Variables Affecting the Percent of Broadband in the US, 2020

Delaney
Apr 11, 2022


Picture from Pedernales Electric Cooperative, Broadband

Introduction

Broadband is high-speed Internet access. More specifically, it is wide-bandwidth data transmission that carries multiple signals across a wide range of frequencies and Internet traffic types, which allows messages to be sent simultaneously. Within the United States, there are over 2,000 internet service providers, including AT&T Wireless, Verizon, T-Mobile, Xfinity, Spectrum, CenturyLink, Cox Communications, and many more. There are six types of broadband: cable modem, fiber, wireless, satellite, digital subscriber line (DSL), and broadband over power lines (BPL).

Using 2020 data from the United States Census Bureau, I will explore a few variables and their relationships. I will focus on Percent of Broadband and Percent of Health Insurance, and I will also compare Percent of Broadband with Family Poverty and with Median Income to see how the relationships vary.

Data

The us_counties.csv file contains many variables, but the one I am focusing on is Percent of Broadband. The variable I chose to compare it with is Percent of Health Insurance. I found that the two have an unusual relationship: their correlation is 0.37, which represents a moderate positive relationship. Now the question is: which variable causes the other?
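As a rough illustration of how a correlation like this can be computed, here is a minimal sketch using pandas. The handful of rows are made-up values, not figures from us_counties.csv; only the column names Pct_Broadband and Pct_Health_Insurance come from the dataset described above.

```python
import pandas as pd

# Hypothetical values for illustration only; the real analysis would load
# the data with pd.read_csv("us_counties.csv") instead.
counties = pd.DataFrame({
    "Pct_Broadband": [62.0, 71.5, 80.2, 68.3, 90.1, 75.4],
    "Pct_Health_Insurance": [88.0, 85.5, 93.1, 91.0, 94.2, 87.9],
})

# Pearson correlation between the two columns (Series.corr defaults to Pearson).
r = counties["Pct_Broadband"].corr(counties["Pct_Health_Insurance"])

# Correlation is symmetric: swapping the columns gives the same value,
# which is why r alone cannot say which variable drives the other.
r_flipped = counties["Pct_Health_Insurance"].corr(counties["Pct_Broadband"])

print(f"r = {r:.2f}, flipped r = {r_flipped:.2f}")
```

Because the coefficient is identical in both directions, the causal question has to be argued from outside the number itself.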

Comparing the Outcomes When X and Y are Flipped

In the graph on the left (blue), Pct_Health_Insurance is on the x-axis and Pct_Broadband is on the y-axis: each 1 percentage point increase in Pct_Health_Insurance is associated with a 0.79 point increase in Pct_Broadband. In the graph on the right (red), Pct_Broadband is on the x-axis and Pct_Health_Insurance is on the y-axis: each 1 percentage point increase in Pct_Broadband is associated with a 0.51 point increase in Pct_Health_Insurance. Ordinarily, I would say that Pct_Health_Insurance affects Pct_Broadband, because I feel that the more likely people are to have health insurance, the more likely they are to also have access to the internet. But this data is from 2020, in the middle of the COVID-19 pandemic, when a majority of business, school, and work activities took place remotely, so I think that Pct_Broadband affects Pct_Health_Insurance and not the reverse.
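To make the "flipped axes" comparison concrete, here is a small sketch using NumPy on synthetic data (the numbers are generated for illustration and are not the census values). It fits a least-squares line in each direction. For simple linear regression the two slopes always multiply to r squared, so the slopes differ only because the two variables have different spreads; neither fit says anything about which variable causes the other.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the two columns (not the census data).
insurance = rng.normal(90, 5, size=500)
broadband = 0.8 * insurance + rng.normal(0, 8, size=500)

# np.polyfit(x, y, 1) returns [slope, intercept] of the least-squares line.
slope_b_on_i, _ = np.polyfit(insurance, broadband, 1)  # broadband on the y-axis
slope_i_on_b, _ = np.polyfit(broadband, insurance, 1)  # insurance on the y-axis

# Pearson correlation between the two synthetic columns.
r = np.corrcoef(insurance, broadband)[0, 1]

print(f"broadband ~ insurance slope: {slope_b_on_i:.3f}")
print(f"insurance ~ broadband slope: {slope_i_on_b:.3f}")
print(f"product of slopes: {slope_b_on_i * slope_i_on_b:.3f}, r^2: {r**2:.3f}")
```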

Since the beginning of the COVID-19 pandemic, the U.S. has faced a new challenge: unequal access to broadband technology. So what links broadband and health insurance?

Analysis

Threats to Internal Validity

Here I discuss a couple of threats to internal validity that are present in the relationship between Pct_Broadband and Pct_Health_Insurance.

Confounding: a third variable that is related to Pct_Broadband also influences Pct_Health_Insurance, distorting the observed relationship between the two.

There are certainly other variables that affect the percentage of people with health insurance, and the main one here is income. Essentially, the more income individuals make, the more access they have to health care, because they can afford the resources that lead to better health. More income = higher Pct_Broadband = higher Pct_Health_Insurance. The graph below depicts a strong positive relationship between Pct_Broadband and Median_Income, with a correlation of 0.71. Each 1 percentage point increase in Pct_Broadband is associated with a $700 increase in Median_Income.

Positive Relationship between Pct_Broadband and Median_Income

Family poverty could be another confounding variable for Pct_Health_Insurance. Similar to Median_Income, a higher rate of family poverty is associated with less access to the internet, which affects the outcome of having health insurance. Pct_Broadband and Pct_Family_Poverty have a strong negative correlation of -0.62. The graph below depicts the relationship between family poverty and broadband. Each 1 percentage point increase in Pct_Broadband is associated with a 0.18 point decrease in Pct_Family_Poverty.

Strong Negative Relationship between Pct_Broadband and Pct_Family_Poverty
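One common way to probe a suspected confounder is a partial correlation: remove the part of each variable that the confounder explains, then correlate what is left. The sketch below does this with NumPy on synthetic data in which income drives both broadband and insurance and there is no direct link between them. The data and the helper name `residuals` are assumptions made for illustration; this is not the analysis behind the figures above.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# Synthetic data: income drives both outcomes, and broadband and
# insurance have no direct effect on each other.
income = rng.normal(0, 1, size=n)
broadband = income + rng.normal(0, 1, size=n)
insurance = income + rng.normal(0, 1, size=n)


def residuals(y, x):
    """Return what is left of y after removing its least-squares linear fit on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


# Raw correlation, which includes the shared influence of income.
raw_r = np.corrcoef(broadband, insurance)[0, 1]

# Partial correlation: correlate the parts of each variable that
# income does not explain.
partial_r = np.corrcoef(
    residuals(broadband, income),
    residuals(insurance, income),
)[0, 1]

print(f"raw r = {raw_r:.2f}, partial r (controlling for income) = {partial_r:.2f}")
```

In this constructed case the raw correlation is clearly positive while the partial correlation is close to zero, which is the pattern a pure confounder would produce. Running the same steps on Pct_Broadband, Pct_Health_Insurance, and Median_Income from the real file would show how much of the 0.37 survives once income is held fixed.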

Events: major events happening during a study.

The data used for this project came from 2020, which happened to fall during a major event that is still taking place. The COVID-19 pandemic has certainly affected Pct_Broadband, which has in turn affected Pct_Health_Insurance in the United States.

Pairplot of Confounding Variables

The above pairplot depicts the variables and their relationships with one another, but I am only focusing on Pct_Broadband with Pct_Family_Poverty, Median_Income, and Pct_Health_Insurance. Pct_Broadband and Median_Income have a strong positive relationship, Pct_Broadband and Pct_Health_Insurance have a moderate positive relationship, and Pct_Broadband and Pct_Family_Poverty have a strong negative relationship. Median_Income is the confounding variable in the relationship between Pct_Broadband and Pct_Health_Insurance. Pct_Broadband doesn’t necessarily cause Pct_Health_Insurance; rather, Median_Income is an external factor that affects Pct_Broadband, which in turn affects Pct_Health_Insurance. Pct_Family_Poverty supports this idea: the higher the percentage of family poverty, the lower the income, which would typically result in a lower Pct_Broadband, affecting Pct_Health_Insurance. The reverse also holds: a lower Pct_Family_Poverty means higher income, resulting in a higher Pct_Broadband, which affects Pct_Health_Insurance in 2020.
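A correlation matrix is the numeric counterpart of a pairplot. The sketch below builds one with pandas for the four columns discussed here. The values are synthetic and only the column names follow the dataset; with the real file, the DataFrame would come from reading us_counties.csv, and a call such as seaborn.pairplot(df) would draw the matching plots.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 300

# Synthetic stand-in data; column names follow the article, values do not.
income = rng.normal(55000, 12000, size=n)
df = pd.DataFrame({
    "Median_Income": income,
    "Pct_Broadband": 40 + income / 2000 + rng.normal(0, 5, size=n),
    "Pct_Family_Poverty": 25 - income / 4000 + rng.normal(0, 3, size=n),
    "Pct_Health_Insurance": 80 + income / 6000 + rng.normal(0, 4, size=n),
})

# Pairwise Pearson correlations between every pair of columns.
corr_matrix = df.corr()
print(corr_matrix.round(2))

# Correlations of every other column with Pct_Broadband,
# ordered by absolute strength (strongest first).
with_broadband = (
    corr_matrix["Pct_Broadband"]
    .drop("Pct_Broadband")
    .sort_values(key=lambda s: s.abs(), ascending=False)
)
print(with_broadband.round(2))
```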

Conclusion

The two variables Pct_Broadband and Pct_Health_Insurance have a correlation of 0.37. I found it interesting to explore other variables in the us_counties.csv file to help explain this correlation. To an extent, Pct_Broadband drives Pct_Health_Insurance, but it is not the sole cause. I found a couple of threats to internal validity present in this dataset and at the time the data was collected: confounding and events, both described above. Median_Income and Pct_Family_Poverty are confounding variables in the relationship between Pct_Broadband and Pct_Health_Insurance. COVID-19 is the most prominent event, and it also had a major effect on Pct_Health_Insurance in 2020. As discussed, there are multiple links between Pct_Broadband and Pct_Health_Insurance, and US citizens still face the challenge of unequal access to broadband technology.
