Is Appalachia what you think it is?

An analysis of the region through Data Analytics

--

Data has an incredible ability to put things into perspective. Through the analysis and visualization of data, we can find insights and trends in areas that would ordinarily be too difficult to understand accurately. This is exactly what Appalachia needs.

Throughout U.S. history, Appalachia has been painted over by sweeping stereotypes, nearly all of them negative. These stereotypes have real consequences for the people living in the region. If you were to imagine the “model Appalachian,” what would that person be like? For many outsiders, the first image that comes to mind might be a hillbilly, a Trump-lover, or a miner. Many of these imagined figures are probably uneducated, uninsured, or impoverished. Putting aside the moral implications of generalizing about others, are these stereotypes fair? Do they give insight into the current conditions in Appalachia or of Appalachian people?

These are questions that I believe data analytics can answer, so I set out to demonstrate this by visualizing the region on a map. To map the region, I decided that state-level data would be too coarse a unit for my visualization. After all, it would be wrong to include the people of the Outer Banks of North Carolina in the region of Appalachia. In my research, I was unable to find data sets more granular than county level that met my needs, so I decided to use county-level data. In terms of comprehensiveness and detail, no data source compared to the data collected by the 2010 Census. While at first I had reservations about the accuracy of the census’s data collection process, I concluded from my research that the Census Bureau’s process was more than accurate enough for my needs. Moreover, economist James Sylvester concluded that the Census Bureau’s population estimates were accurately reported as well (Sylvester).

So I began collecting county-level datasets from the 2010 Census, USDA projections, and the 2016 presidential election. The census data was taken from the Census Bureau’s website (Stewart, Carolyn, and Administrative and Customer Services). The projections were taken from the USDA’s website (USDA ERS). Finally, the presidential election results were taken from Townhall.com (Townhall). After collecting these datasets, I took an SVG file from Wikipedia to use as the base map for my visualizations. Then I wrote a Python script that parses my datasets and generates a visualization. To determine the colors and ranges used, I followed the guidelines laid out by Lin (Lin 401).
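As a rough illustration of this kind of pipeline (not the actual script), the sketch below reads a county-level CSV and fills matching paths in an SVG map. It rests on several assumptions that are not confirmed above: that each county is an SVG path whose id is its five-digit FIPS code, that the CSV has columns named fips and value, and that values are placed into equal-width bins. The color ramp, function names, and demo values are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical five-step color ramp (light to dark); not the article's palette.
COLORS = ["#edf8e9", "#bae4b3", "#74c476", "#31a354", "#006d2c"]


def load_values(csv_text, key="fips", field="value"):
    """Map zero-padded five-digit county FIPS codes to float values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key].zfill(5): float(row[field]) for row in reader}


def pick_color(value, low, high, colors=COLORS):
    """Place value into one of len(colors) equal-width bins on [low, high]."""
    if high <= low:
        return colors[0]
    fraction = (value - low) / (high - low)
    index = int(fraction * len(colors))
    # Clamp so values outside [low, high] still get the first or last color.
    return colors[max(0, min(index, len(colors) - 1))]


def color_map(svg_text, values, low, high):
    """Return SVG text with each matching county path filled by its value."""
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)
    for path in root.iter(f"{{{SVG_NS}}}path"):
        fips = path.get("id")  # assumption: path id is the county FIPS code
        if fips in values:
            path.set("style", f"fill:{pick_color(values[fips], low, high)}")
    return ET.tostring(root, encoding="unicode")


# Made-up demo inputs; the values are not taken from any real dataset.
demo_csv = "fips,value\n1001,12.5\n54059,30.0\n"
demo_svg = (
    f'<svg xmlns="{SVG_NS}">'
    '<path id="01001" d="M0 0"/><path id="54059" d="M1 1"/>'
    '<path id="99999" d="M2 2"/></svg>'
)
colored = color_map(demo_svg, load_values(demo_csv), low=0.0, high=40.0)
print(colored)
```

Counties with no matching row are left untouched, so gaps in the data remain visible on the map instead of being silently colored.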

In my analysis, I found that while many stereotypes about the Appalachian region are founded on some truth, they are largely out of proportion when put in the context of the entire country. In fact, nearly every stereotype assigned to the region would be more accurate for some other state. I think my findings are important because this is a perspective on the region that largely gets lost in an ocean of bias and stereotypes.

To begin, let’s look at education in Appalachia.

Education

First, let’s see the rates of “non-education”.

Adults with less than a high school diploma per county 2012–2016

USDA 5-year averages of Census Bureau’s 2010 Census data: https://www.ers.usda.gov/data-products/county-level-data-sets/

This visualization shows the rate of adults with less than a high school diploma, averaged over 2012–2016. The data was taken from the U.S. Department of Agriculture’s five-year averages of projections from the U.S. Census Bureau’s 2010 Census data. It is worth noting that this data is based on estimates, but these estimates are more relevant to the present day than the raw data from the 2010 census. Also, the cluster of counties shown on the right aligns with the Appalachian Regional Commission’s definition of Appalachia. This is the definition used for the remainder of this analysis, though some people define Appalachia differently.

With all of this in mind, what can we learn about Appalachia from the data we see? First of all, the region appears to be fairly polychromatic. There are counties in nearly every state with relatively low rates of adults without a high school diploma, but there are certainly counties with high rates as well. Eastern Kentucky, rural West Virginia, northern Alabama, and northern Georgia clearly have areas that are falling behind the rest of the country in terms of high school graduation rates. That said, the overall “redness” of the region seems relatively in line with the South at large, and the region is no more “red” than, say, California. Let’s invert this map and look at it with a different color scale…

Adults with high school diploma or more per county 2012–2016

USDA 5-year averages of Census Bureau’s 2010 Census data: https://www.ers.usda.gov/data-products/county-level-data-sets/

This map uses the same data source with a different color scale: instead of the rate of “uneducation,” it shows the rate of adults with a high school diploma or more. This color scale gives contrast between counties that are above the curve and counties that are below it in terms of education. Here we can see that the areas identified before stand out on this scale. In general, the region is mostly yellow-green, with the exception of some red areas and the very dark green counties of Pennsylvania and New York. In fact, this northern vs. southern trend can be seen across the entire country. In my opinion, if a divide in education exists between two regions of the U.S., then it is between the North and the South, not Appalachia and Non-Appalachia. Let’s go further than high school diplomas; let’s look at college.

Adults with some college or more per county 2012–2016

USDA 5-year averages of Census Bureau’s 2010 Census data: https://www.ers.usda.gov/data-products/county-level-data-sets/

Here we can see the percentage of adults with some college or more. This is probably where the “uneducated” stereotype is born. In terms of “greenness,” Appalachia is falling behind in college attendance, specifically in West Virginia and Kentucky. These states are clearly underperforming, and it certainly looks like their education systems need change. However, North Carolina, Virginia, and northern Georgia have very respectable rates in comparison to the rest of the country.

Adults with Bachelor’s degree or more per county 2012–2016

After filtering out adults with less than a bachelor’s degree, we can see the rate of higher education in the region. While such degrees are sparse in much of Kentucky and West Virginia, many counties in Appalachia have at least a quarter of adults with a bachelor’s degree or higher.

So let’s compare the Appalachian region with the rest of the country:

USDA 5-year averages of Census Bureau’s 2010 Census data: https://www.ers.usda.gov/data-products/county-level-data-sets/

These pie charts show the average of the county rates, rather than the rate across the total population. In other words, each county counts equally regardless of its size. With that in mind, let’s look at the data. First, in the average Non-Appalachian county a majority of adults have at least some college education, but this is not the case for the average Appalachian county. Second, rates of high school education or less are noticeably higher for Appalachian counties. Finally, though, the differences in proportion are not large enough to suggest that an Appalachian is less educated than a Non-Appalachian, because the two charts are similar in structure and proportions.
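The distinction between an average across counties and a rate across the whole population can matter a great deal when counties differ in size. The sketch below contrasts the two using made-up numbers, not figures from the USDA data; the function names are hypothetical.

```python
def county_mean(rates):
    """Unweighted mean of per-county rates: every county counts equally."""
    return sum(rates) / len(rates)


def pooled_rate(rates, populations):
    """Population-weighted rate: every adult counts equally."""
    total = sum(populations)
    return sum(r * p for r, p in zip(rates, populations)) / total


# Made-up numbers for illustration only.
rates = [10.0, 30.0]            # percent of adults in each toy county
populations = [90_000, 10_000]  # number of adults in each toy county

print(county_mean(rates))               # 20.0
print(pooled_rate(rates, populations))  # 12.0
```

In this toy case the small county pulls the county average well above the pooled rate, which is why a chart should state which of the two it shows.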

Income

Next, let’s look at the household income for the region.

Median household income in 2009 in dollars

This data was taken directly from the 2010 Census. The values shown are median household incomes expressed in inflation-adjusted 2009 dollars. Looking at this data, we can see that the Appalachian region looks greyed-out and lacks color in comparison to much of the rest of the country. This suggests that the region has noticeably lower incomes. It is important to remember, however, that these values do not account for the cost of living, so they do not necessarily mean that the purchasing power of Appalachians is lower than in the rest of the country.

To get a better indication of income conditions, let’s look at the rate of benefits in the region.

Households with Food Stamp/SNAP benefits per county 2005–2009

Here, we are given a clearer indication of income conditions like poverty. Unfortunately, we see another indication of eastern Kentucky and West Virginia falling behind the rest of the country. Let’s look at income assistance as well:

Households with cash public assistance income (or Welfare) 2005–09

Here, rates of cash public assistance income for the region appear relatively normal in comparison to the rest of the country. Unfortunately, this is not necessarily an indicator of economic health, since these rates are heavily dependent on local laws and customs.

As a whole, it is clear that there are parts of Appalachia in need of economic development and reform. While we cannot necessarily conclude that the lower incomes in the region indicate lower purchasing power, the high rates of food stamp or SNAP benefits suggest that there are areas within the region that suffer from higher rates of poverty.

Health Insurance

One indicator for quality of life is access to affordable healthcare.

Percent of population with health insurance 2009

Here, we are given a more positive indication of the well-being of Appalachians. Rates of insurance are high in comparison to the rest of the country, and a vast majority of the region is over 80% insured. Here is a direct comparison of health insurance rates across the total populations of Appalachia and Non-Appalachia:

As a whole, Appalachians are more insured than Non-Appalachians.

Mining: The Ghost Industry

One major driver in Appalachian history and development has been the mining industry. But how important is mining in the region now?

Total Employment in Mining by SIC Code 2005–2009

Here we can see the total number of mining employees based on SIC industry codes. If you can’t see most of the map, that’s because mining is incredibly location-dependent. In addition, automation has replaced many mining jobs in the past twenty years. Let’s look at this map instead as a percentage of total employees per county.

Percent Employment in Mining by SIC Code 2005–2009

Now the map looks even tougher to see, and this isn’t simply because of the color scale. There were relatively few counties with more than 3% of their employees in mining. However, the counties that do mine have relatively large portions of their employees in mining companies. In Nevada, one county even approached 80%.
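The step from the first mining map to the second is a simple normalization: each county’s mining employment divided by its total employment. A minimal sketch of that calculation follows, using made-up county names and counts, not the SIC figures; the function name and the handling of counties with no recorded employment are assumptions.

```python
def mining_share(mining_jobs, total_jobs):
    """Percent of a county's employment in mining, or None if undefined."""
    if total_jobs <= 0:
        return None  # avoid dividing by zero when no employment is recorded
    return 100.0 * mining_jobs / total_jobs


# Made-up counties: (mining employees, total employees).
counties = {
    "County A": (50, 10_000),
    "County B": (1_600, 2_000),
    "County C": (0, 0),
}

shares = {name: mining_share(m, t) for name, (m, t) in counties.items()}
for name, share in shares.items():
    print(name, share)
```

A county with few miners in absolute terms can still be heavily dependent on mining, which is why the two maps highlight different places.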

So how much money do these companies bring into the region?

Total Earnings in Mining by SIC Code in dollars 2005–2009

From this data, we can see that many of these mining counties are earning millions of dollars per year. But how do these numbers compare to earnings as a whole?

Earnings from Mining as a percentage of Total Earnings 2005–09

As we can see, mining accounts for 15% or more of total earnings in many of these “mining counties.” Within the Appalachian region, however, these counties are primarily located in West Virginia, Kentucky, and Pennsylvania. As a whole, mining does not account for much of the total employment or earnings of the region. While no one can deny that mining has historical significance for the region, this data contradicts both the claim that Appalachian economies are tethered to mining industries like coal extraction and the stereotype that Appalachians are typically miners.

Politics

Finally, let’s look at the political affiliations of the region in the 2016 presidential election. This data was taken from Townhall.com, an independent political reporting organization.

Percent of votes for GOP in 2016 Presidential Election

As we can see, the share of total votes given to the GOP candidate is very high in the Appalachian region. Certainly, it is accurate to state that President Trump had a massive following in the Appalachian region. However, this majority does not appear to be nearly as large as in much of the Midwest.

Percent of votes for Democrats in 2016 Presidential Election

For the Democratic Party, Appalachia is significantly less colorful under the same color scale. In much of the region, less than 30% of each county’s total votes went to the Democratic candidate. However, it is worth noting that this does not necessarily mean that less than 30% of the region’s votes went to the Democratic candidate, since counties differ widely in the number of votes cast.

Difference in percentage of votes in 2016 Presidential Election

Here we can see the percentage difference in total votes per county. As we can see, a huge portion of the region voted primarily for the GOP candidate. So how do the total votes tally up?

Vote comparison between Non-Appalachia and Appalachia

Of all votes cast in Appalachian counties, 63.7% went to Donald Trump, while 32.7% went to Hillary Clinton. This gap is significantly larger than in the rest of the country, which voted slightly in favor of Hillary Clinton. An argument could certainly be made that the Appalachian region contains an overwhelming majority of Trump supporters, but on average, Appalachian counties gave about 33% less of their total votes to Hillary Clinton than Non-Appalachian counties. Is this difference enough to warrant the stereotype that all Appalachians are Trump supporters? I would argue not.
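For readers who want to reproduce a tally like this from county-level results, the sketch below pools raw vote counts across counties before computing shares, and separately computes a per-county margin like the one mapped earlier. The party keys, function names, and vote counts are hypothetical and are not taken from the Townhall data.

```python
def vote_shares(county_results):
    """Pool raw vote counts across counties; return each party's share (%)."""
    totals = {}
    for result in county_results:
        for party, votes in result.items():
            totals[party] = totals.get(party, 0) + votes
    all_votes = sum(totals.values())
    return {party: 100.0 * votes / all_votes for party, votes in totals.items()}


def county_margin(result, first="gop", second="dem"):
    """Percentage-point gap between two parties within a single county."""
    cast = sum(result.values())
    return 100.0 * result[first] / cast - 100.0 * result[second] / cast


# Made-up results for two toy counties of very different sizes.
toy_counties = [
    {"gop": 700, "dem": 250, "other": 50},
    {"gop": 3_000, "dem": 5_500, "other": 500},
]

pooled = vote_shares(toy_counties)
print(pooled)
print(county_margin(toy_counties[0]))
```

Here the small county leans heavily toward one party, yet the pooled tally leans the other way because the larger county casts most of the votes. This is the reason a map of county percentages and a regional vote total can tell different stories.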

Final words

In summary, the stereotypes assigned to the Appalachian region are born from truth, but only some truth. In nearly every case, these biases towards the region are heavily overstated and out of proportion when put in perspective against the rest of the country. In many areas where Appalachia was given a stereotype, Appalachia actually outperformed states like California and Texas. While Appalachia has areas where it needs to improve to meet the rest of the country, these differences pale in comparison to the image assigned to the region.

Works Cited

“Breaking 2016 Presidential Election News, Polls and Results.” Townhall, townhall.com/election/.

“County-Level Data Sets.” USDA ERS — County-Level Data Sets, www.ers.usda.gov/data-products/county-level-data-sets/.

Lin, Sharon, et al. “Selecting Semantically‐Resonant Colors for Data Visualization.” Computer Graphics Forum, vol. 32, no. 3pt4, 2013, pp. 401–410.

Stewart, Carolyn, and Administrative and Customer Services. USA Counties Data File Downloads, www.census.gov/support/USACdataDownloads.html.

Sylvester, James T. “Population stats: how accurate are the estimates?” Montana Business Quarterly, Autumn 2012, p. 16+. General OneFile, http://link.galegroup.com/apps/doc/A310741503/ITOF?u=viva_vpi&sid=ITOF&xid=adcce8d3. Accessed 28 Apr. 2018.
