Analysis of Poverty and Education in Appalachia

This analysis covers the poverty rate of the entire Appalachian region from 2005 to 2016. The data come from the Appalachian Regional Commission (ARC) and cover all counties within the Appalachian region as defined by the ARC.
The goal of this analysis was to see whether there is a significant correlation between educational attainment percentages and the poverty rate. The research question is: “Can educational level be related to the poverty rates in Appalachia?”
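
As a rough illustration of what "correlation" means here, the sketch below computes a Pearson correlation coefficient between a completion percentage and a poverty rate. The function name and the numbers are made up for illustration and are not taken from the ARC data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values, NOT taken from the ARC data.
hs_completion = [0.70, 0.74, 0.78, 0.82, 0.86]
poverty_rate = [0.26, 0.23, 0.21, 0.18, 0.16]

# A value close to -1 means poverty falls as completion rises.
print(pearson_r(hs_completion, poverty_rate))
```

A coefficient near -1 or +1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship.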

The first thing I looked at was the poverty rate of each state in Appalachia over time. For the rest of this report, when I refer to a state, I will be talking about the Appalachian region of the specified state.

We can see that different areas of Appalachia have different poverty rates. For example, the Appalachian regions of Kentucky and Mississippi have historically had high poverty rates, while many other states, such as Virginia and Tennessee, are grouped around a lower rate.

We can also see through this graph that states with high poverty rates also sit at a high percentage of the US average poverty rate. It is interesting to see that the states with lower rates have held steady, while the states with high poverty (Kentucky and Mississippi) have lowered their percentage over the years. This could also be due to economic trends throughout the US as a whole.
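
The "percentage of the US average" measure is simple arithmetic: a region's poverty rate divided by the national rate, times 100. A minimal sketch follows; the function name and the example rates are hypothetical, not values from the data.

```python
def percent_of_us_average(region_rate, us_rate):
    """Express a region's poverty rate as a percentage of the US rate."""
    if us_rate <= 0:
        raise ValueError("US rate must be positive")
    return 100.0 * region_rate / us_rate


# Hypothetical rates in percent: a region at 25% against a US rate of 15%.
# The result is above 100, so the region sits above the national rate.
print(percent_of_us_average(25.0, 15.0))
```

A value of 100 means the region matches the national rate, and anything above 100 means the region is poorer than the US as a whole.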

The next thing I wanted to look into was the percentage of high school completion compared to poverty rates by state.

We can see that there appears to be a strong connection between the percentage of high school diploma completion and the poverty rate. As the percentage of high school diplomas increases, the poverty rate decreases. One thing to note is the pattern of the groupings. There appears to be a slight curve for each state: as the percent of high school completion increases, the poverty rate first increases slightly, then decreases. We need to confirm that the percentage of high school diploma completion is increasing over time before we can reach any conclusion.

In this graph, we can confirm that the percent of high school completion increases steadily over time for all states. One observation to note is how quickly the rate of high school completion increases over the years.
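
The same check can be done numerically rather than by eye. The sketch below tests whether each state's series rises from one year to the next; the state labels and values are placeholders, not ARC figures.

```python
def is_increasing(values):
    """True if every value is strictly greater than the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


# Placeholder series, one value per year, NOT taken from the ARC data.
completion_by_state = {
    "State A": [0.71, 0.73, 0.74, 0.76],
    "State B": [0.80, 0.81, 0.83, 0.84],
}

for state, series in completion_by_state.items():
    print(state, is_increasing(series))
```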

Let’s see how the percent of bachelor’s degree completion looks when compared to the poverty rate. We can expect this percentage to be lower than the percent of high school completion.

The same observation can be made about the curvature of the clusters. The percentage of bachelor’s degree completion appears to increase throughout, but in the short term the poverty rate still increases. After a few years, the poverty rate then seems to drop off. Again, we want to make sure that the percentage of bachelor’s degree completion actually increases over time before drawing a conclusion.

By looking at this graph, we again see that the percent of bachelor’s degree completion increases over time. The curved pattern in poverty rates may have something to do with the economic trends of the US as a whole, but it may be reasonable to conclude that the increase in educational levels is helping to lower the poverty rate within Appalachia.

Finally, I wanted to see if we could find a line of best fit for this data.

By using stepAIC, a stepwise model selection procedure based on the Akaike information criterion (AIC), we can narrow down the important factors and compute a best fit line. We can see that the poverty rate can be modeled using the percent of bachelor’s degree completion and the percent of high school completion.
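
To give a feel for what AIC-based selection does, here is a minimal sketch of backward elimination written in plain Python. It is not the stepAIC implementation itself: it only drops predictors (stepAIC can also search in other directions), and the column names, the unrelated "other" predictor, and all data values are synthetic assumptions made for illustration.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(columns, names, y):
    """Least-squares fit of y on an intercept plus the named columns.

    Returns (coefficients, residual sum of squares).
    """
    rows = [[1.0] + [columns[name][i] for name in names] for i in range(len(y))]
    p = len(names) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, rss


def aic(rss, n, n_params):
    """AIC of a Gaussian linear model, up to an additive constant."""
    return n * math.log(rss / n) + 2 * n_params


def backward_select(columns, y):
    """Drop one predictor at a time for as long as doing so lowers the AIC."""
    n = len(y)

    def score(names):
        return aic(fit_ols(columns, names, y)[1], n, len(names) + 1)

    selected = list(columns)
    best = score(selected)
    while selected:
        trials = [(score([k for k in selected if k != drop]), drop)
                  for drop in selected]
        trial_aic, drop = min(trials)
        if trial_aic >= best:
            break  # no single removal improves the model
        best = trial_aic
        selected.remove(drop)
    return selected, best


# Synthetic data, NOT the ARC data: y is built from hs and ba plus small noise.
columns = {
    "hs": [0.70, 0.72, 0.74, 0.75, 0.77, 0.78, 0.80, 0.81, 0.83, 0.84, 0.86, 0.88],
    "ba": [0.12, 0.15, 0.11, 0.16, 0.13, 0.18, 0.14, 0.19, 0.15, 0.21, 0.17, 0.22],
    "other": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0],
}
noise = [0.004, -0.003, 0.002, -0.005, 0.001, 0.003,
         -0.002, 0.005, -0.004, 0.002, -0.001, -0.002]
y = [0.6 - 0.5 * h - 0.2 * b + e
     for h, b, e in zip(columns["hs"], columns["ba"], noise)]

selected, best_aic = backward_select(columns, y)
print(selected, best_aic)
```

A lower AIC indicates a better trade-off between fit and model size, so a predictor is kept only when removing it would make the score worse.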

The final equation is poverty rate = 0.612 - 0.478 * percent high school completion - 0.201 * percent bachelor’s degree completion.
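
As a quick worked example, the equation can be wrapped in a small function. The coefficients come from the equation above; the function name is hypothetical, and the sketch assumes both inputs are proportions between 0 and 1 (for example 0.80 for 80%), which the size of the intercept suggests but the text does not state.

```python
# Coefficients from the fitted equation above.
INTERCEPT = 0.612
COEF_HIGH_SCHOOL = -0.478
COEF_BACHELORS = -0.201


def predict_poverty_rate(hs_completion, bachelors_completion):
    """Estimate the poverty rate from two completion proportions.

    Assumption: inputs and output are proportions (0 to 1), not percentages.
    """
    return (INTERCEPT
            + COEF_HIGH_SCHOOL * hs_completion
            + COEF_BACHELORS * bachelors_completion)


# Hypothetical inputs: 80% high school completion, 20% bachelor's completion.
# 0.612 - 0.478 * 0.80 - 0.201 * 0.20 is roughly 0.189, or about 18.9%.
print(predict_poverty_rate(0.80, 0.20))
```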

We want to use a different test to see if we get a similar result.

By comparing the estimated coefficients in the table, we can see that this equation fits the data well. Since the p-value returned was very small (2.2e-16), we can conclude that the relationship is statistically significant, although a small p-value alone does not show that this is the best possible model for the data.

We should check the general shape of the data, so we will test normality.

The Shapiro-Wilk normality test returns a p-value higher than 0.05, so we cannot reject the hypothesis that this data comes from a population with a normal distribution. The Normal Q-Q Plot points the same way, as the line appears to be generally straight. We can go ahead with our model from before.
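
The Q-Q check can also be sketched in code. The example below pairs sorted observations with theoretical normal quantiles and measures how close the pairs are to a straight line. It is not the Shapiro-Wilk test, the function names are hypothetical, and the plotting positions (i + 0.5) / n are one common choice among several. In a regression setting, the sample would typically be the model residuals.

```python
import math
from statistics import NormalDist


def qq_points(sample):
    """Pair each theoretical standard normal quantile with a sorted observation."""
    n = len(sample)
    ordered = sorted(sample)
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


def qq_correlation(sample):
    """Correlation between theoretical and observed quantiles.

    Values close to 1 mean the Q-Q plot is close to a straight line.
    Assumes the sample is not constant.
    """
    points = qq_points(sample)
    tx = [t for t, _ in points]
    ox = [o for _, o in points]
    mean_t = sum(tx) / len(tx)
    mean_o = sum(ox) / len(ox)
    cov = sum((t - mean_t) * (o - mean_o) for t, o in points)
    var_t = sum((t - mean_t) ** 2 for t in tx)
    var_o = sum((o - mean_o) ** 2 for o in ox)
    return cov / math.sqrt(var_t * var_o)


# Synthetic samples for illustration only.
normal_like = [NormalDist(0.1, 2.0).inv_cdf((i + 0.5) / 20) for i in range(20)]
skewed = [math.exp(i / 2) for i in range(20)]

print(qq_correlation(normal_like))  # very close to 1
print(qq_correlation(skewed))       # noticeably lower
```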

In conclusion, we were able to model the poverty rate of the Appalachian states using the percentages of high school and bachelor’s degree completion. This equation can be used to estimate the poverty rate for a given year from the educational completion percentages alone.

For future analysis, population, economic trends of the US, income, and unemployment rate can be factored in to get a more accurate prediction for poverty.

Works Cited:

“Data Reports.” Data Reports — Appalachian Regional Commission, www.arc.gov/research/DataReports.asp.
