
Deriving hospital travel times with population-weighted sampling

April 21, 2020 by Eric Buth
To explore more data on COVID-19, please go to covid19.topos.com

Topos
Apr 22 · 6 min read

One of the early stories to emerge concerning COVID-19 was the vulnerability of certain communities in the U.S. caused by a lack of access to the critical medical care typically provided in hospitals. Today, Navajo County, Arizona, home to three Native American reservations (the Navajo Nation, Hopi Indian Reservation, and Fort Apache Indian Reservation), is experiencing one of the country’s highest per capita rates of COVID-19 (435 cases per 100K people). What makes the county’s high rate especially alarming is the lack of points of medical care able to treat serious COVID-19 patients. To understand vulnerability through the lens of access to critical medical care, we generated a new feature for all counties in the U.S.: travel time to nearby hospitals. Here we detail our methodology for generating this feature at the county level.

Image for post
Navajo County COVID-19 Cases per 100k people with Median Distance to Nearby Hospitals (in seconds) April 21, 2020

Working effectively with geographic data from multiple sources often requires a strategy for translating between geographic units. This translation is a non-trivial technological challenge, but in the case of COVID-19 can prove important in answering simple but critical questions such as:

At Topos, we maintain a large amount of categorized information about the locations of businesses, public institutions, etc., sometimes referred to as “points of interest” or “POI.” As the geographic resolution of POI data is effectively infinite (they are points in space), the core challenge is how to aggregate these points so they can be used in relation to features available for coarser geographic units such as counties and states.

The most straightforward way to approach this aggregation is to simply count the number of points that fall within a larger geography, a strategy we take with other relevant POI such as housing units. However, there are cases where these numbers don’t fully capture the relationship that people have with the resources being counted.
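As a minimal sketch of this counting approach, the example below uses a standard ray-casting point-in-polygon test. The square "county", the POI coordinates, and the function names are all hypothetical, and coordinates are treated as planar, which is only a rough approximation for real longitude/latitude data.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if `point` lies inside `polygon`."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_in_region(points: List[Point], polygon: List[Point]) -> int:
    """Count how many points fall inside the polygon."""
    return sum(point_in_polygon(p, polygon) for p in points)


# Hypothetical square "county" and three POI, one of which lies outside it.
county = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
poi = [(1.0, 1.0), (3.5, 2.0), (5.0, 5.0)]
poi_count = count_points_in_region(poi, county)
```

The count says nothing about where inside the region the points sit relative to residents, which is the limitation discussed next.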

Image for post
The nearest hospital to Spray, Oregon, is in neighboring Morrow County, over an hour away.

What if the nearest hospital to a large part of a county’s population is actually in a neighboring county or several counties away? What if most of the grocery stores in a county are located far away from where the residents of the county live? What if a county represents the outer suburbs of a major city or is abutted by national park land? Simply counting points doesn’t sufficiently capture their accessibility, which is particularly important for critical resources like hospitals, urgent care centers, pharmacies, grocery stores or schools.

Rather than simply counting points within a region, we may want to have a sense of how easily those points can be accessed. To this end, we often look not only at geographic distance to these points but actually calculate how long it takes to reach them via common modes of transportation (driving, walking, etc.).

Image for post
Subway time to selected points in NYC visualized on level 16 S2 cells.

In this project, we begin with the time it might take an ambulance to reach the nearest hospital that has in-patient services — that is, a hospital with bed count greater than zero.

In order to decide which of the thousands of hospital locations are the closest to a given address, we use an S2-based geospatial index, which allows us to quickly search radiuses around those addresses to build candidate lists. This step is important because of the amount of time and resources it would otherwise take to evaluate the travel time to every hospital in the country. Once we have the reduced list of locations that are closest as the crow flies, we then need to determine which one is actually the quickest to drive to on available roads.
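To make the candidate-list idea concrete, here is a rough sketch that swaps the S2 index for a plain linear scan using the haversine great-circle distance. This is not the S2-based approach described above (a spatial index exists precisely to avoid scanning every hospital); it only illustrates the filtering logic. The hospital records, field names, and the 50 km radius are hypothetical.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle ("as the crow flies") distance in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def candidate_hospitals(origin: Tuple[float, float],
                        hospitals: List[Dict],
                        radius_km: float) -> List[str]:
    """Names of in-patient hospitals within radius_km, nearest first."""
    lat, lon = origin
    nearby = []
    for h in hospitals:
        if h["beds"] <= 0:  # keep only hospitals with in-patient beds
            continue
        d = haversine_km(lat, lon, h["lat"], h["lon"])
        if d <= radius_km:
            nearby.append((d, h["name"]))
    return [name for _, name in sorted(nearby)]


# Hypothetical hospital records; names and coordinates are made up.
hospitals = [
    {"name": "A", "lat": 44.05, "lon": -121.3, "beds": 100},
    {"name": "B", "lat": 44.30, "lon": -121.2, "beds": 25},
    {"name": "C", "lat": 44.02, "lon": -121.3, "beds": 0},   # no in-patient beds
    {"name": "D", "lat": 46.00, "lon": -121.3, "beds": 50},  # too far away
]
candidates = candidate_hospitals((44.0, -121.3), hospitals, radius_km=50.0)
```

The resulting list is short enough that each entry can be checked for actual driving time.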

For the same reasons we needed hospital candidate lists, we now need a method for limiting the number of addresses we use as origins, that is, the locations from which we compute a path to each candidate hospital using a routing API (here.com, Google Maps, Mapbox, etc.). To accomplish this we construct a sample: a meaningful subset of addresses that roughly represents the entire county.
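The step of choosing the quickest candidate for one origin might look like the sketch below. The `travel_time_s` callable is a hypothetical stand-in for a call to whichever routing API is in use; here it is stubbed with a made-up lookup table so the example runs without network access.

```python
from typing import Callable, Iterable, Optional, Tuple

Origin = Tuple[float, float]  # (latitude, longitude)


def quickest_hospital(
    origin: Origin,
    candidates: Iterable[str],
    travel_time_s: Callable[[Origin, str], Optional[float]],
) -> Optional[Tuple[str, float]]:
    """Return (hospital, seconds) for the fastest drive, or None if unreachable."""
    best = None
    for hospital in candidates:
        seconds = travel_time_s(origin, hospital)
        if seconds is None:  # the router found no drivable route
            continue
        if best is None or seconds < best[1]:
            best = (hospital, seconds)
    return best


# Made-up drive times in seconds. "A" is listed first (nearest as the crow
# flies) but is slower to reach by road than "B".
FAKE_TIMES = {"A": 1500.0, "B": 900.0, "C": None}


def fake_router(origin: Origin, hospital: str) -> Optional[float]:
    """Hypothetical stub standing in for a real routing API request."""
    return FAKE_TIMES[hospital]


result = quickest_hospital((44.0, -121.3), ["A", "B", "C"], fake_router)
```

Each call to the router costs time and money in practice, which is why both the candidate list and the number of origins need to stay small.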

One way to build this sample would be to pick addresses at random within a given county’s geographic boundaries. However, sampling in this manner risks significantly over-representing less densely populated areas. For example, in Deschutes County, Oregon, this approach results in selecting as many points from national forest land as from the city of Bend.
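A small simulation can illustrate why uniform geographic sampling follows land area rather than population. The geometry below is entirely hypothetical (a square county with a single square city), not a model of Deschutes County.

```python
import random

# Hypothetical county: a 100 km x 100 km square containing one
# 10 km x 10 km city. All numbers are illustrative, not real geography.
COUNTY_SIZE_KM = 100.0
CITY_MIN_KM, CITY_MAX_KM = 45.0, 55.0


def in_city(x: float, y: float) -> bool:
    """True if the point falls inside the hypothetical city square."""
    return CITY_MIN_KM <= x <= CITY_MAX_KM and CITY_MIN_KM <= y <= CITY_MAX_KM


rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples = 10_000
points = [
    (rng.uniform(0, COUNTY_SIZE_KM), rng.uniform(0, COUNTY_SIZE_KM))
    for _ in range(n_samples)
]

# Share of uniformly sampled points that land in the city.
city_share = sum(in_city(x, y) for x, y in points) / n_samples
```

Because the city covers 1% of the area, roughly 1% of the sampled points land there, no matter how much of the county's population lives in it.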

Image for post
Random point sampling in Deschutes County, Oregon
Image for post

The resulting travel times to nearby hospitals appear to be evenly distributed. Intuitively, this seems wrong: human geographic organization tends to concentrate both resources and population, and in Deschutes County at least three cities exist that should push the distribution away from this apparent randomness. If county residents are significantly more likely to live in a city with a nearby hospital, we’d expect that to be reflected by a concentration of values around a lower median travel time.

Image for post
County, tract, and block group boundaries of Deschutes County, Oregon

We adjust our sampling strategy to account for this issue by using the population counts of census block groups, which are significantly smaller than counties. We’re not producing final metrics at such a low level, but we can use the more granular population counts to weight our random sample of starting points. Imagine that for every person in a county we put one marble, labeled with the block group where that person lives, in a bucket. To get a population-weighted sample, we repeatedly pick a marble from the combined bucket — replacing it each time — and note the label.

Image for post

The effect is that every person has an equal chance of being selected, even though the resulting block group counts are not themselves equal. Once we have constructed this list of block groups, we then pick a random address within their geographical bounds.
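The marble analogy maps directly onto weighted sampling with replacement. The sketch below is one possible way to express it using Python's `random.choices`; the block group names, populations, and bounding boxes are hypothetical, and picking a uniform point inside a bounding box is a simplified stand-in for picking a random address within the block group's actual boundary.

```python
import random
from collections import Counter

# Hypothetical block groups: population plus a bounding box
# (min_x, min_y, max_x, max_y) in arbitrary planar units.
block_groups = {
    "bg_city_a": {"population": 6000, "bbox": (45.0, 45.0, 55.0, 55.0)},
    "bg_city_b": {"population": 3000, "bbox": (20.0, 70.0, 26.0, 76.0)},
    "bg_forest": {"population": 1000, "bbox": (0.0, 0.0, 100.0, 40.0)},
}


def sample_origins(block_groups, k, rng):
    """Draw k block groups weighted by population, then a point in each."""
    names = list(block_groups)
    weights = [block_groups[name]["population"] for name in names]
    # One "marble" per resident, drawn with replacement.
    picks = rng.choices(names, weights=weights, k=k)
    origins = []
    for name in picks:
        min_x, min_y, max_x, max_y = block_groups[name]["bbox"]
        # Stand-in for "a random address within the block group".
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        origins.append((name, x, y))
    return origins


rng = random.Random(0)  # fixed seed so the run is repeatable
origins = sample_origins(block_groups, k=1000, rng=rng)
counts = Counter(name for name, _, _ in origins)
```

In this made-up example the large forest block group holds 10% of residents, so it should receive roughly 10% of the sampled origins despite covering most of the area.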

Image for post
Population-weighted point sampling in Deschutes County, Oregon
Image for post

This sampling strategy now shows points clustered around three cities within Deschutes County, the population centers of Bend, Redmond, and Sisters, with some outliers along Highway 97. The hospital travel time values now form something closer to a normal distribution, with a median around 13 minutes, a stark difference from the likely misleading 50 minutes of the previous example.
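Reducing the sampled travel times to a single county-level value could be sketched as follows. The travel times are made up for illustration and are not results from this analysis; the mean is computed alongside only to show how a single remote origin affects each summary.

```python
import statistics

# Hypothetical travel times (seconds) from five sampled origins to their
# quickest hospital; the last origin is a remote outlier.
travel_times_s = [300, 540, 720, 900, 3300]

median_s = statistics.median(travel_times_s)  # middle value of the sample
mean_s = statistics.mean(travel_times_s)      # pulled upward by the outlier
median_min = median_s / 60                    # same median, in minutes
```

With these numbers the median is 720 seconds (12 minutes) while the mean is 1,152 seconds, so the median is less sensitive to a handful of remote origins.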

With our metric in hand, we can now examine it in relation to the rapidly unfolding crisis of COVID-19. The visualization below highlights which counties have high per capita COVID-19 infections and the lowest access to hospitals (as of April 21, 2020).

Image for post



Written by

Topos

Transforming the way we understand cities with Artificial Intelligence | @topos_ai
