A Study Into Location-Based Covid-19 Cases Throughout Los Angeles

Francisco Avalos Jr.
Published in DataLA
Jan 8, 2021
Mapping of reported Covid-19 cases

Tracking how an infectious disease moves through a population during a pandemic yields valuable information that can help mitigate and reduce its spread. Unfortunately, major cities throughout the world provide the perfect environment for the rapid transmission of Covid-19.

Photo by Roman Mager on Unsplash

As a Data Angel working with the City of Los Angeles, my goal was to repurpose an SIR-based model to provide insights at the location level. Obtaining adequate data was naturally the first step. LA Public Health publishes daily Covid-19 cases at the location (or point-of-interest) level throughout Los Angeles County. SafeGraph provides foot traffic data for many physical locations across the United States. Using these two sources as the base building blocks, I built the Python code that retrieves, cleans, and prepares this data for our modeling.
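As a rough illustration of the preparation step, the sketch below matches case records to foot-traffic records on a normalized name and address key. The field names (`name`, `address`, `cases`, `visits`) and the sample rows are hypothetical placeholders, not the actual schemas of either source, and exact-key matching is only one simple way to join the two.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(text.split())


def join_cases_with_traffic(case_rows, traffic_rows):
    """Attach foot-traffic counts to case rows that share a normalized key."""
    traffic_by_key = {
        (normalize(row["name"]), normalize(row["address"])): row["visits"]
        for row in traffic_rows
    }
    joined = []
    for row in case_rows:
        key = (normalize(row["name"]), normalize(row["address"]))
        if key in traffic_by_key:  # keep only locations found in both sources
            joined.append({**row, "visits": traffic_by_key[key]})
    return joined


# Made-up rows for illustration only (hypothetical names and numbers).
cases = [
    {"name": "Sunny Care Center", "address": "123 Main St.", "cases": 12},
    {"name": "Unmatched Place", "address": "9 Elm Ave", "cases": 3},
]
traffic = [
    {"name": "SUNNY CARE CENTER", "address": "123 Main St", "visits": 900},
]
joined = join_cases_with_traffic(cases, traffic)
```

In practice, names and addresses rarely line up this cleanly, so a real pipeline would likely need fuzzier matching or a shared location identifier.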

Leveraging the free GPU (graphics processing unit) compute offered by Google through Google Colab notebooks, I built the Python code that produces risk scores for these points of interest. Key to any SIR-based model is the number of individuals in the population; the model cannot be assessed without it. To estimate a population for each point of interest, we average its foot traffic over the previous month using the Monthly Patterns dataset provided by SafeGraph.
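To make the role of the population figure concrete, here is a minimal sketch of a textbook discrete-time SIR simulation whose population is the average of a location's daily visits. It is not the project's actual model: the function names, the daily visit counts, and the `beta` and `gamma` values are all assumptions chosen for illustration, and the article's risk score is not computed here because its definition is not given.

```python
def estimate_population(daily_visits):
    """Assumed proxy: mean daily foot traffic over the previous month."""
    return sum(daily_visits) / len(daily_visits)


def simulate_sir(population, infected, beta, gamma, days):
    """Discrete-time SIR with one-day steps; returns a list of (S, I, R)."""
    s, i, r = population - infected, float(infected), 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # New infections scale with contacts between S and I, capped at S.
        new_infections = min(beta * s * i / population, s)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical inputs: three days of visits stand in for a month of data.
population = estimate_population([100, 120, 80])
history = simulate_sir(population, infected=5, beta=0.3, gamma=0.1, days=30)
```

Because every term in the update divides by or starts from `population`, an error in the foot-traffic estimate carries straight through to the simulated curves, which is why the population input matters so much.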

Running the model shows that care/wellness/nursing/retirement centers, auto dealerships, hospitals, and assisted living locations have the highest risk scores.

These findings provide useful precautionary information to visitors of these and similar locations. They can also help regulators and researchers pinpoint the spread of Covid-19 by type of location throughout Los Angeles, and they allow us to study how the virus transmits even with some regulation in place and what can be improved.

P.S. I collected this data over the course of this project, and to my knowledge, LA Public Health hasn't built an API to help researchers and students retrieve it easily. Further, because of how the daily figures are published, historical case counts are lost to the public. Thus, I've released my collected data on Kaggle (below) and plan to maintain it periodically for some time.
