Does Density Determine COVID-19 Destiny? Let’s look at the Data

Apr 6, 2020

On March 22, Andrew Cuomo, the governor of New York, described the level of density in New York City as “destructive” and called for “an immediate plan to reduce density”. This was not the first time New York City had been criticized for its high density. This time, however, Governor Cuomo’s criticism carried new weight, since it blamed density for the rapid spread of COVID-19 in the city. As of April 5, New York City had 67,552 cases of COVID-19, making it the city with the highest number of cases nationally by a wide margin.

Density has become a widely discussed topic in (mostly virtual) public and professional circles. One group believes that cities have always been hotbeds for epidemics and that the current pandemic is only another externality of high-density urban areas, alongside air and noise pollution. In an interview with the New York Times, Dr. Steven Goodman, an epidemiologist at Stanford University, suggests that cities “with large population centers, where people are interacting with more people all the time” are the places where the virus is “going to spread the fastest.” The other group points to Asian cities that are denser than New York but have been very effective in controlling the virus. They argue that the problem is not density but inaction and the lack of proper infrastructure. Emily Badger writes in the New York Times that density is what makes New York City resilient in the face of a crisis. Her position is aligned with a longstanding tradition in urban planning that promotes high-density cities as sustainable, walkable and innovative.

Investment in American downtowns has been steadily increasing for the past few decades, following the dominant wave of post-war suburbanization. We have to expect this pandemic to have a lasting impact on these investments and on the public attitude towards the densification of cities. Responding to these transformations requires a close evaluation of the situation by a variety of disciplines, including public health, urban policy, planning, design, and architecture. Such an inquiry demands going beyond opinions biased by disciplinary tradition or personal preference and looking at the facts of the situation using data.

Our goal in this article is not to come to a definite conclusion on the relationship between density and the spread of COVID-19 but to start a data-informed discussion on the topic. We are sharing the packaged Tableau workbook so researchers can explore this data from other perspectives. We use the dataset provided by the Corona Data Scraper website, which “pulls COVID-19 case data from verified sources” such as the department of health and human services. The website records the number of cases in the United States at the county level every day.

Question and Approach

The concern with these discussions is that density is being used as a broad term to prove a predetermined theory. The broad question can be summarized as follows: Does higher density lead to the spread of infectious diseases like COVID-19 and hence result in a higher number of cases? On the surface, the answer to this question seems to be yes. Metropolitan areas across the United States show the highest number of COVID-19 cases. These results are attributed to the physical proximity and public transportation systems in these cities.

Density and Cases of COVID-19 in U.S. counties

To push the boundaries of our analysis, we need to ask more questions: What level of density are we talking about? How does that level affect our hypothesis about the role of density? Are counties with COVID-19 cases denser than the ones that have none yet? Do more people use public transportation for their commute in these counties, and is that a significant contributing factor?

We approached these questions using two different models. First, we utilized regression analysis to identify any correlations between public transportation, density, and cases of COVID-19 in a county. Next, we used a threshold-based model to identify whether there are certain levels of density that change the probability of a county getting infected with the virus.

Regression Model: Is there a positive correlation between the density, public transportation commute and cases of COVID-19 in urban areas?

To answer this question, we focused on the number of cases per capita in a county and not the total number of cases. Cases per capita is a more reliable measure since it captures the relative density of cases rather than the absolute number. The hypothesis here is that density raises the number of cases per capita. The analysis of 1134 counties with positive cases of COVID-19, however, doesn’t show a positive correlation between the two parameters. Therefore, we can’t claim at the national level that higher density equals higher cases per capita. One can argue, however, that including small cities when evaluating the impact of density doesn’t make sense. We therefore narrowed our analysis to the 532 counties with a population of at least 100,000. In this case, a weak positive correlation (R-squared = 0.13, p<0.01) is observed. If we exclude the New York state counties, which have the highest cases of COVID-19, we get a weaker positive correlation (R-squared = 0.07, p<0.01).
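The filtering and fitting steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the county records are made up, the function names (`r_squared`, `fit`) are hypothetical, and the original analysis was done in a Tableau workbook, not with this code. It computes R-squared for a simple one-variable linear fit of cases per capita on density, with an optional population floor and state exclusion.

```python
from math import fsum

def r_squared(xs, ys):
    """R-squared of a simple least-squares line fit of ys on xs."""
    n = len(xs)
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("need variation in both variables")
    # For a one-predictor linear fit, R-squared is the squared correlation.
    return (sxy * sxy) / (sxx * syy)

# Made-up records: (name, state, population, density per sq. mile, cases)
counties = [
    ("A", "NY", 1_500_000, 30000.0, 9000),
    ("B", "NY", 400_000, 2500.0, 900),
    ("C", "CA", 900_000, 17000.0, 500),
    ("D", "TX", 250_000, 600.0, 100),
    ("E", "WA", 120_000, 300.0, 90),
    ("F", "OH", 50_000, 150.0, 10),
]

def fit(records, min_population=100_000, exclude_states=()):
    """Filter counties, then fit cases per capita against density."""
    kept = [r for r in records
            if r[2] >= min_population and r[1] not in exclude_states]
    density = [r[3] for r in kept]
    per_capita = [r[4] / r[2] for r in kept]
    return len(kept), r_squared(density, per_capita)

n_all, r2_all = fit(counties)
n_no_ny, r2_no_ny = fit(counties, exclude_states=("NY",))
print(n_all, round(r2_all, 3))
print(n_no_ny, round(r2_no_ny, 3))
```

Running the fit twice, with and without one state, mirrors the comparison in the text: if R-squared drops sharply once a handful of counties are removed, the correlation is being driven by those counties rather than by density in general.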

Results for the correlation between public transportation commute rate and cases per capita are not so different. A positive yet weak correlation (R-squared = 0.15, p<0.01) exists between cases per capita and public transportation commute rate. As in the density analysis, the correlation gets weaker (R-squared = 0.07, p<0.01) when we exclude the 4 New York counties with extremely high values.

*The correlation between density, public transit commute rate and case per capita is weak and inconsistent*

The national-level analysis ignores the different testing capacities of different states. Also, it should be noted that the number of cases in a county is affected by the overall spread in the state. Therefore, a state-level analysis might reveal a more realistic perspective of the relationship between density and the spread of the virus.

The state-level analysis doesn’t show a statistically significant positive correlation between the parameters except in New York, which shows a consistent positive correlation between cases per capita and both density (R-squared = 0.17, p < .01) and public transportation commute ratios (R-squared = 0.25, p < .01). Bronx, Kings (Brooklyn) and New York counties are the densest counties in NY and also have the highest cases per capita.

While our study shows that urban areas have more cases per capita than rural areas, at the national level no consistent correlation was found between urban density, public transportation commute ratios and cases per capita. At the state level, only New York showed statistically significant positive correlations between these parameters.

Threshold Model: Is there a density threshold that changes the chances of a county getting infected by the virus?

While in the first model we were looking for a correlation between parameters, in this section we look for a threshold that raises the odds of an outbreak in a county. The threshold approach to urban density and pandemics has been used previously by Chandra et al. (2013). First, we compared the density and public commute ratio of counties with no cases of COVID-19 and counties with at least one positive case. The average density of counties with at least one case of COVID-19 is 534 people per square mile, while the number is 73 for counties without any COVID-19 cases. This is not surprising. Denser counties tend to have airports, are connected to the highway system, and in general have larger populations, so we would expect them to have at least one case of COVID-19. As the graph below shows, most counties without identified cases of COVID-19 have a density below 100 people per square mile. Also, in the majority of these counties, less than 1% of the population uses public transportation for commuting.

If a county has a density below the threshold of 100 people per square mile, there is a 36% chance that it has no identified cases of COVID-19. This number is 2% for counties with a density above that threshold.
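The percentages above are shares of counties on each side of a density cutoff. A minimal sketch of that calculation follows, assuming each county is reduced to a (density, cases) pair. The sample values and the function name `no_case_share_by_threshold` are hypothetical and do not come from the Corona Data Scraper dataset.

```python
def no_case_share_by_threshold(counties, threshold=100.0):
    """Share of counties with zero cases, below and at/above a density threshold.

    counties: iterable of (density_per_sq_mile, case_count) pairs.
    Returns (share_below, share_at_or_above).
    """
    below = [cases for density, cases in counties if density < threshold]
    above = [cases for density, cases in counties if density >= threshold]

    def share(group):
        if not group:
            raise ValueError("no counties on one side of the threshold")
        return sum(1 for cases in group if cases == 0) / len(group)

    return share(below), share(above)

# Made-up sample: (density, cases)
sample = [
    (20.0, 0), (45.0, 3), (80.0, 0), (95.0, 1),
    (150.0, 2), (900.0, 40), (3000.0, 0), (12000.0, 500),
]

below_share, above_share = no_case_share_by_threshold(sample, threshold=100.0)
print(below_share, above_share)
```

Sweeping `threshold` over a range of values and watching where the two shares diverge most is one simple way to look for a cutoff like the one reported here.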

A density threshold of 100 people per square mile marks the counties with no cases of COVID-19

However, the difference between urban and rural densities is not the point of contention here, as discussions have mostly centered on optimizing the density of cities rather than on the gap between urban and rural density. Such discussions assume an optimal level of density in a city which, if exceeded, makes outbreaks inevitable or much harder to control. Therefore, a more interesting question is whether there is a density threshold that divides highly infected counties from the ones mildly impacted by the virus. Density thresholds are not a new idea: previous studies have examined them carefully for the 1918 pandemic, but they warrant further study by urban planners and geographers.

For this article, we define a county as highly infected when it has at least 1 case per 1000 people, that is, a case per capita of 0.001. In the graph below, a density of 1000 people per square mile and a public commute ratio of 0.01 mark that level. There are 2293 counties with a density of 1000 people per square mile or less, and 2% of them have a case per capita of 0.001 or larger. On the other hand, only 108 counties have a density above 1000 people per square mile, and 24% of them have a case per capita of 0.001 or larger.

Counties with a case per capita of 0.001 and higher are marked in orange. The circle radius shows the case per capita. Queens County and San Francisco County are highlighted; they have similar commute and density values but very different COVID-19 cases per capita.

If a county has a density below the threshold of 1000 people per square mile, there is only a 2% chance that it is highly infected by COVID-19. This number is 23% for counties with a density above that threshold.
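To make the definition concrete, the sketch below flags a county as highly infected when its cases per capita reach 0.001, splits counties at a density of 1000 people per square mile, and compares the share of flagged counties on each side. The records are invented for illustration, and the names `HIGH_INFECTION` and `high_infection_shares` are assumptions, not part of the original workbook.

```python
HIGH_INFECTION = 0.001  # cases per capita, i.e. 1 case per 1000 residents

def high_infection_shares(counties, density_threshold=1000.0):
    """Share of highly infected counties at/below and above a density threshold.

    counties: iterable of (population, density_per_sq_mile, case_count).
    Returns (share_at_or_below, share_above).
    """
    below, above = [], []
    for population, density, cases in counties:
        flagged = cases / population >= HIGH_INFECTION
        (above if density > density_threshold else below).append(flagged)
    if not below or not above:
        raise ValueError("no counties on one side of the threshold")
    return sum(below) / len(below), sum(above) / len(above)

# Made-up sample: (population, density, cases)
sample = [
    (50_000, 80.0, 10),
    (200_000, 400.0, 100),
    (100_000, 900.0, 150),
    (300_000, 950.0, 30),
    (1_000_000, 5000.0, 2500),
    (800_000, 17000.0, 400),
]

share_below, share_above = high_infection_shares(sample)
ratio = share_above / share_below  # how much likelier above the threshold
print(share_below, share_above, ratio)
```

Applied to the shares reported in this article (2% below the threshold and 23% above it), the same ratio comes out at roughly 11.5, which is the sense in which crossing the threshold changes the odds.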

This threshold changes our perspective on the relationship between density and the spread of the virus. While there are no strong correlations between density, public commute ratio and COVID-19 cases per capita, there are density thresholds that, if passed, considerably change the odds of infection.

What do these results mean for the future of cities?
Density alone doesn’t define destiny. Preparedness is Key.

Our analysis shows that while density does not determine the spread of the virus, it could increase the odds of infection in a county. Highly dense counties are often more connected, internally by public transportation systems and externally via international airports. These connections create a network effect that makes these counties hubs of interaction and innovation, and often of excellent healthcare. The same connections, ironically, can result in a faster spread of a virus.

However, if density were the only factor, it would not explain the difference between San Francisco and Queens County. While San Francisco County and Queens County have very similar levels of density and public transportation commute, Queens has 15 times more cases per capita. Rather than blaming density for the spread of COVID-19, we need to consider the fact that cities with different densities have different dynamics that require different modes of design, planning, and action.

A highly dense city such as New York offers possibilities for mobilizing actions that are next to impossible in a suburban or rural setting. Yet, in the face of a pandemic, it was taken by surprise. Singapore, a city with a higher density than NY, was able to contain the pandemic rapidly through early detection and quarantine, contact tracing, social distancing, rapid deployment of healthcare and alternate healthcare sites, and a robust system focused on containment and mitigation.

Density cannot be ignored when planning and designing for pandemics. But it must be addressed within the context of the vulnerabilities and possibilities of a specific urban density, with an understanding that beyond a specific threshold cities become hyperdense and require hypervigilance. We also have to think about fundamentally resilient health systems that form the foundation of a network of cities, instead of concentrating foundational human health resources in a few mega-centers. Perhaps it’s time for Urban Tech and Urban Health to come together in equitable and agile ways, so we can be prepared for a pandemic without fundamentally compromising the facets of a city that make it thrive.

In this debate about density, data serves as a valuable window into how we can approach urban planning. Our findings call for a continued debate about density, one that looks at both incidence and mortality and includes the lenses of equity, social experience, technology, and environmental quality.

In our next discussion, we will look at the growth rate and the ability to flatten the curve, and study what this means for planning and design. We look forward to engaging with diverse perspectives.

Authors:

Babak Soleimani, Ph.D., Data Scientist HKS Inc.
Upali Nanda, Ph.D., Assoc. AIA, Director of Research, HKS Inc., Associate Prof. of Practice, Taubman College of Architecture and Urban Planning
Sheba Ross, AICP CUD, CDT, International Assoc.AIA, Senior Urban Planner and Designer HKS Inc.


Written by CADRE Research