If you type “food deserts” into Medium’s image search, you’ll find pages of pictures like the one above. They look delicious, but food deserts are something very different, and they are actually quite hard to define. The term refers to areas with limited access to grocery stores, supermarkets, or other sources of healthy and affordable food. Low access is often measured by distance to the nearest supermarket, paired with other indicators such as vehicle access and household income. Food deserts are one of many indicators of socioeconomic injustice.
The purpose of this data product is to identify exactly which metric(s) define a food desert, examine how low access differs across demographic groups, and understand the extent to which food deserts affect health outcomes. More broadly, by mapping out food deserts, this research can be used to recommend where low-price grocery chains such as ALDI should open stores next in order to reach the most vulnerable populations.
My primary dataset is the Food Access Research Atlas from the USDA Economic Research Service, which contains census tract-level data. I aggregate this dataset to the county level and merge it with 2015 County Business Patterns from the U.S. Census Bureau and 2015 County Health Rankings. All cleaning, analyses, and visuals were done in R.
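To illustrate the aggregate-and-merge step, here is a minimal Python sketch (the analysis itself was done in R, so this is not the original code). It relies on the fact that the first five digits of an 11-digit census tract GEOID are the state and county FIPS codes. The sample rows, the `pct_diabetic` column, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical tract rows: (11-digit tract GEOID, low-access flag).
# GEOIDs are kept as strings so leading zeros are not lost.
tract_rows = [
    ("01001020100", True),
    ("01001020200", False),
    ("01003010100", True),
    ("01003010200", True),
]

# Hypothetical county-level health table keyed by 5-digit county FIPS.
county_health = {
    "01001": {"pct_diabetic": 11.0},
    "01003": {"pct_diabetic": 12.5},
}


def county_fips(tract_geoid: str) -> str:
    """Return the state (2 digits) + county (3 digits) prefix of a tract GEOID."""
    return tract_geoid[:5]


def aggregate_to_county(rows):
    """Return {county FIPS: share of that county's tracts flagged low access}."""
    flags = defaultdict(list)
    for geoid, low_access in rows:
        flags[county_fips(geoid)].append(low_access)
    # Averaging booleans gives the fraction of tracts that are True.
    return {fips: sum(values) / len(values) for fips, values in flags.items()}


low_access_share = aggregate_to_county(tract_rows)

# Inner join: keep only counties that appear in both tables.
merged = {
    fips: {"low_access_share": share, **county_health[fips]}
    for fips, share in low_access_share.items()
    if fips in county_health
}
```

Averaging a boolean flag this way turns a tract-level yes/no into a county-level share between 0 and 1.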
Overview of grocery store count
Let’s begin by plotting the distribution of grocery store count per 100K people at the county level. As we see in the chart below, this metric is very right-skewed: a small portion of counties have a very high number of grocery stores relative to their population. However, it turns out that the top 10 counties with the most grocery stores per 100K people are in Alaska, Colorado, Nebraska, Montana, and Oregon. These counties simply have very low populations, which inflates our grocery store count metric.
This shows that low food access is a nuanced problem to identify, and that grocery store count alone is not a sufficient measure.
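The inflation effect is easy to see with a little arithmetic. The Python sketch below uses two made-up counties (the store counts and populations are hypothetical, not taken from the data) to show how a handful of stores in a tiny county can produce a larger per-100K figure than hundreds of stores in a large one.

```python
def stores_per_100k(store_count: int, population: int) -> float:
    """Grocery stores per 100,000 residents."""
    return store_count * 100_000 / population


# Hypothetical counties, for illustration only.
small_county = stores_per_100k(store_count=3, population=1_500)
large_county = stores_per_100k(store_count=900, population=1_000_000)

print(f"small county: {small_county:.0f} stores per 100K")
print(f"large county: {large_county:.0f} stores per 100K")
```

Three stores serving 1,500 people score higher on this metric than 900 stores serving a million, even though residents of the small county may have far fewer options within reach.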
Overview of census tract-level data
First, let’s briefly explore the demographics in our census tract-level data. As the chart below shows, about 3/4 of all census tracts in the U.S. are classified as urban. Moreover, urban tracts have a higher low income proportion than non-urban tracts. This is important to note for two reasons:
- Living 10 miles away from a supermarket feels very different as an urban resident vs. a non-urban resident. Thus, when we examine data in the next section on proximity to a supermarket, we need to understand that distances of even 1 mile and beyond are signals of low access for a large number of urban residents.
- We haven’t examined the relationship between income and food access yet, but if this relationship exists, urban residents may disproportionately experience low food access.
Supermarket access by ethnicity
The first chart below illustrates the percentage of different non-white ethnicities who live more than 1/2 mile, 1 mile, 10 miles, and 20 miles from a supermarket. The second chart below puts this into perspective, depicting the breakdown of ethnicities in the total non-white U.S. population.
At 1/2 mile and 20 miles, the proportions match up fairly well. However, the American Indian and Alaska Native (AIAN) population is overrepresented among those living more than 1 mile and more than 10 miles away from a supermarket. The gap is starkest at 10 miles: the AIAN population comprises 56.4% of the non-white population living more than 10 miles from a supermarket, while comprising only 2.2% of the total non-white U.S. population.
Taking a broader look at supermarket access for minority ethnicities, it appears that at least in aggregate, non-white ethnicities do not experience disproportionately low access to supermarkets compared to the white population. In fact, the chart below shows that the white population comprises 62.2% of the total U.S. population but 73.6% of those living more than 10 miles away from a supermarket.
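One way to put these comparisons on a common scale is a representation ratio: a group's share of the low-access population divided by its share of the reference population. The Python sketch below applies it to the percentages quoted above. The function name is my own, and note that the two ratios use different reference populations (non-white for AIAN, total for white), so they are not directly comparable to each other.

```python
def representation_ratio(share_of_low_access: float, share_of_population: float) -> float:
    """Values above 1 mean the group is overrepresented among low-access residents."""
    return share_of_low_access / share_of_population


# AIAN share of non-white residents more than 10 miles out, vs. share of all non-white residents.
aian_ratio = representation_ratio(56.4, 2.2)

# White share of all residents more than 10 miles out, vs. share of the total population.
white_ratio = representation_ratio(73.6, 62.2)

print(f"AIAN (within non-white population): {aian_ratio:.1f}x")
print(f"White (within total population): {white_ratio:.2f}x")
```

The AIAN ratio comes out to roughly 25.6, while the white ratio is about 1.18.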
Poverty and supermarket access
Next, let’s revisit the relationship between income and supermarket access. My initial hypothesis was that there would be a significant relationship between income and supermarket access, for a number of reasons. Perhaps supermarkets want to open stores in areas with a higher income population, or a low income population lacks the resources to support the operation of additional supermarkets.
The two charts below depict the distribution of poverty rate for census tracts that are classified as low access at 10 (and 20) miles and those that are not. Note that the data are filtered to exclude outlier tracts with poverty rates of 50% and above. We can see that tracts classified as low access at 10 miles have a slightly higher median poverty rate than those that are not (14.5% vs. 12.8%). For tracts classified as low access at 20 miles, the gap is wider (15.7% vs. 12.9%). This suggests that there is some relationship between poverty rate and supermarket access, but it is not as strong as I would have expected.
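The comparison behind these charts amounts to filtering out the high-poverty outliers and then taking a median per group. Here is a minimal Python sketch of that calculation on made-up tracts. The sample values and function name are hypothetical, and the 50% cutoff is the only detail taken from the text.

```python
from statistics import median

# Hypothetical tracts: (poverty rate in percent, classified as low access at 10 miles?).
sample_tracts = [
    (8.0, False),
    (12.0, False),
    (20.0, False),
    (10.0, True),
    (15.0, True),
    (30.0, True),
    (62.0, True),  # dropped by the outlier filter below
]


def median_poverty_by_access(rows, cutoff=50.0):
    """Return (median for low-access tracts, median for other tracts), excluding rates >= cutoff."""
    kept = [(rate, low_access) for rate, low_access in rows if rate < cutoff]
    low = [rate for rate, low_access in kept if low_access]
    other = [rate for rate, low_access in kept if not low_access]
    return median(low), median(other)


low_access_median, other_median = median_poverty_by_access(sample_tracts)
```

Using the median rather than the mean keeps the comparison robust to whatever skew remains after the filter.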
Perhaps distance to a supermarket alone is not the best measure of food deserts. Higher income suburbs tend to sprawl, and residents in those areas likely drive farther to supermarkets. Clearly, this is not the same as experiencing food insecurity. The chart below uses a different measure for low access: the percentage of a tract with low vehicle access living more than 20 miles from a supermarket. Note that a tract is defined as having low vehicle access if more than 100 households in the tract do not have a vehicle and are more than half a mile from a supermarket. Here, there is a clearer negative relationship between income and food deserts.
Given that income often correlates strongly with health outcomes, the above chart foreshadows that there may be a relationship between supermarket access and health outcomes.
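The combined measure can be written as two simple rules. The Python sketch below encodes the low vehicle access definition given above and pairs it with a 20-mile low-access flag. The class and field names are hypothetical and are not the column names used in the Food Access Research Atlas.

```python
from dataclasses import dataclass


@dataclass
class Tract:
    # Households with no vehicle that are more than half a mile from a supermarket.
    no_vehicle_households_beyond_half_mile: int
    # Whether the tract is classified as low access at 20 miles.
    low_access_at_20_miles: bool


def has_low_vehicle_access(tract: Tract) -> bool:
    """More than 100 carless households beyond half a mile, per the definition in the text."""
    return tract.no_vehicle_households_beyond_half_mile > 100


def is_food_desert(tract: Tract) -> bool:
    """Low access at 20 miles combined with low vehicle access."""
    return tract.low_access_at_20_miles and has_low_vehicle_access(tract)


sprawling_suburb = Tract(no_vehicle_households_beyond_half_mile=12, low_access_at_20_miles=True)
isolated_tract = Tract(no_vehicle_households_beyond_half_mile=240, low_access_at_20_miles=True)
```

Under these rules the sprawling suburb is far from a supermarket but is not flagged, because most of its households can drive there.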
Supermarket access and health outcomes
The pair plot below shows the pairwise relationships between supermarket access and four health outcomes (% fair or poor health, physically unhealthy days, % diabetic, and life expectancy). The plot is colored by a boolean variable (True/False) indicating whether at least 50% of the county’s census tracts are classified as low income. I chose the variable mentioned above (low access at 20 miles and low vehicle access) as a measure of the share of each county that is in a food desert.
The first column and first row of the plot show that there is a moderate correlation between supermarket access and the health factors chosen. Low access is most correlated with % fair or poor health and life expectancy. Although the correlations are not extremely high (0.434 and -0.401, respectively), it’s jarring to see that living in a food desert is associated with fewer years of life. I would’ve expected an even higher correlation for % diabetic, as other research has shown. One possible reason for this lower correlation is that in order to merge with county level health data, I had to aggregate food access data from census tract level to county level, taking averages of the boolean low access variables. Perhaps if I had tract level health data, this relationship would be more precise.
One additional insight: Notice how more blue dots are clustered towards the top of the scatter plots in rows 2–4, and towards the bottom of the plots in row 5. This indicates that low income counties (blue dots) tend to have a larger share of residents in poor health and a lower average life expectancy.
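For readers who want to see what sits behind each number in a pair plot, here is a minimal Python sketch of a Pearson correlation coefficient, assuming that is the statistic reported above (the text does not name it). The county values are made up for illustration and will not reproduce the figures quoted.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical counties: share of tracts classified as food deserts, and life expectancy in years.
food_desert_share = [0.0, 0.1, 0.2, 0.4]
life_expectancy = [80.0, 79.0, 79.5, 76.0]

r = pearson(food_desert_share, life_expectancy)
```

A negative `r` means counties with a larger food desert share tend to have lower life expectancy. It describes an association in the data and says nothing on its own about cause.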
To test our earlier hunch, the pair plot below uses just distance to a supermarket (1 mile for urban tracts and 20 miles for non-urban tracts) as a measure of food deserts. When we no longer consider vehicle access, correlations with health outcomes are cut in half. This suggests that “low access” is not just about the presence of supermarkets or even distance to supermarkets, but also the ability to bridge that distance.
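The distance-only measure boils down to a threshold that depends on whether a tract is urban. The Python sketch below is a simplification that assumes a single representative distance per tract, which is not how the underlying dataset stores the information. The function name and parameters are hypothetical.

```python
def low_access_by_distance(is_urban: bool, miles_to_supermarket: float) -> bool:
    """Flag a tract as low access using 1 mile for urban and 20 miles for non-urban tracts."""
    threshold_miles = 1.0 if is_urban else 20.0
    return miles_to_supermarket > threshold_miles


# The same 2-mile distance counts as low access in an urban tract but not in a non-urban one.
urban_flag = low_access_by_distance(is_urban=True, miles_to_supermarket=2.0)
rural_flag = low_access_by_distance(is_urban=False, miles_to_supermarket=2.0)
```

Nothing in this rule knows whether residents can actually cover the distance, which is the gap that the vehicle access variable fills.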
Conclusions & future applications
This exploratory analysis begins to paint a picture of what exactly it means to live in a food desert. First, we found that grocery store count per 100K people is not a good indicator of food deserts because counties with low populations have an inflated measure. Second, we determined that distance to a supermarket alone may also not be the best indicator of food deserts, both because of the confounding effect of vehicle access and because urban and non-urban tracts experience distance differently. Thus, from the variables analyzed in this project, distance, in conjunction with vehicle access, appears to be the best indicator of food deserts and potential predictor of health outcomes.
The map below colors U.S. counties by the percentage of census tracts that I classify as food deserts, i.e., more than 20 miles away from a supermarket and with low vehicle access. Low-price grocery stores like ALDI can use the data underlying this map to identify where to expand next. Local governments can use this research to improve public transportation options and bridge the gap to supermarkets.
Ultimately, this research shows how nuanced food deserts are and how many variables define what it means to be food insecure. If I were to expand on this research, I would look into the confounding impact of group quarters, i.e., tracts with large numbers of people living in residential arrangements where an entity or organization provides food and housing. I would also like to delve deeper into the interplay between ethnicity and income and the disparities in food access there.
Links to datasets