Introducing GeoCrossWalk, a new API to accelerate your data analysis in LA County and beyond

Published in A.R.G.O. (Applied Research in Government Operations), May 11, 2018

This post is the second in our series from ARGO’s civic data marketplace.

The first post can be found here and the second can be found here.

Geo Street Talk is envisioned as a GIS tool to translate on-street locations into human descriptions of street segments (“between such and such street and other avenue”).

This work is part of a larger experiment to test incentives for public data talent to deliver civic data science projects.

In the long term, we hope to pioneer a new pathway for local governments to tap into talent at lower cost and higher quality than current contracting methods allow.

Are You Working with Multiple Datasets of LA County? Let Geo-Crosswalk Help You Speed Up Your Analysis

by Gaurav Bhardwaj, Lingyi Zhang

If you’re a civic-tech person working with urban data, this scenario may resonate with you:

You’re working on a project and then you realize that two or more datasets you plan to use are aggregated at different geographies.

For example, one is aggregated at the city council district level and the other at the regional school district level. You spend hours converting the geographies to merge these datasets. The frustration can be quite palpable. Have no fear, however!

Crosswalk files are quite common at the federal administrative data level. The US Census Bureau offers relationship files that link up the various census geographies (see image below) so that census data can be analyzed across different “administrative geographies” (census tract, census block, MSA, etc.).

(L) Census Geographies, courtesy of ESRI; (R) Hierarchy of Census Geographies, courtesy of the US Census Bureau

However, these relationship files often do not link up with “local geographies” (neighborhoods, community and school districts, sanitation, fire, water, health and similar geographies).
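To make the idea of a crosswalk concrete, here is a minimal sketch of how a crosswalk table lets you roll the same block-level numbers up to two different geographies. Everything in it is an assumption for illustration: the IDs, the counts, and the `aggregate` helper are made up, and it assumes each block sits entirely inside one district of each type, which real local boundaries do not always guarantee.

```python
from collections import defaultdict

# Hypothetical crosswalk: each small unit (here, a census block) is tagged
# with the larger geographies that contain it. All IDs are made up.
CROSSWALK = [
    {"block": "B1", "council_district": "CD-1", "school_district": "SD-A"},
    {"block": "B2", "council_district": "CD-1", "school_district": "SD-B"},
    {"block": "B3", "council_district": "CD-2", "school_district": "SD-B"},
]

# Toy block-level counts (illustrative values, not real data).
POPULATION = {"B1": 100, "B2": 250, "B3": 400}


def aggregate(values, crosswalk, target):
    """Sum block-level values up to the geography named by `target`."""
    totals = defaultdict(int)
    for row in crosswalk:
        totals[row[target]] += values.get(row["block"], 0)
    return dict(totals)


by_council = aggregate(POPULATION, CROSSWALK, "council_district")
by_school = aggregate(POPULATION, CROSSWALK, "school_district")
print(by_council)
print(by_school)
```

Because both roll-ups start from the same crosswalk rows, a dataset reported by council district and one reported by school district can be compared through their shared blocks.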

Use cases for a local geo-crosswalk:

  1. An analyst at the City of Los Angeles is collaborating with a peer at the County, and together they are looking to develop an integrated plan to address homelessness in downtown LA.
  2. Regional planning authority staff want to quickly assess the impact of various demographic forecast scenarios across jurisdictions beyond city boundaries (for example, school district or water district boundaries).
  3. A researcher at a local university wants to do a comprehensive comparative analysis of fees charged by all local governments in a specific area.

A Geo-crosswalk — One file that binds them all

We present to you a geo-crosswalk, a service built for LA County and New York City. This portal retrieves all the geographies associated with any given point in either area.

Geo-crosswalk converts an address or a pair of position coordinates into a list of the local geographies that the address or point falls in.
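As a rough illustration of what such a lookup involves, the sketch below tests a point against the polygons of several layers and reports which polygon contains it in each. It is not the service's actual implementation: the layer names, the square "districts", and the `lookup` helper are hypothetical, and it uses a plain ray-casting test on flat coordinates rather than a spatial database.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: True if (lon, lat) lies inside the ring of vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray cast from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical layers made of simple squares, not real boundaries.
LAYERS = {
    "council_district": {
        "CD-1": [(0, 0), (2, 0), (2, 2), (0, 2)],
        "CD-2": [(2, 0), (4, 0), (4, 2), (2, 2)],
    },
    "school_district": {
        "SD-A": [(0, 0), (4, 0), (4, 1), (0, 1)],
        "SD-B": [(0, 1), (4, 1), (4, 2), (0, 2)],
    },
}


def lookup(lon, lat, layers=LAYERS):
    """Return {layer name: id of the polygon containing the point, or None}."""
    result = {}
    for layer, polygons in layers.items():
        result[layer] = next(
            (gid for gid, ring in polygons.items()
             if point_in_polygon(lon, lat, ring)),
            None,
        )
    return result


print(lookup(1, 0.5))
print(lookup(3, 1.5))
```

An address would first need to be geocoded to coordinates before a lookup like this could run; that step is outside the sketch.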

Local geographies were sourced from the open data repositories for LA County and NYC. For LA, these include:

  • Law Enforcement Reporting Districts
  • LA County Communities
  • Registrar Recorder Precincts
  • Congressional Districts
  • State Senate Districts
  • State Assembly Districts
  • 2011 Supervisorial Districts
  • Area Names
  • 2010 Census Tracts
  • 2010 Block Groups
  • 2010 Census Blocks
  • Federal Information Processing Standard Numbers
  • City Name
  • Postal Code
  • Public Use Microdata Areas (PUMA)
  • Health Districts

Geo-Crosswalk can be consumed via:

Create your own Geo-Crosswalk

If you find this Geo-Crosswalk useful, producing one of your own is fairly straightforward:

  1. Aggregate the shapefiles of all local geographies in your city / county.
  2. Using QGIS (or other GIS software), use a spatial overlay function to merge all the local geographies into a single file.
  3. Convert the shapefile into a GeoJSON file and upload it into a PostgreSQL database (ogr2ogr is a great resource for this).
  4. Create an API to query the database. We used Amazon Web Services (AWS) RDS, Lambda, and API Gateway to create ours. Please check our GitHub for more details.
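For step 4, a Lambda function behind API Gateway could look roughly like the sketch below. It is only an outline under stated assumptions: the `lat` and `lon` query parameter names and the `find_geographies` helper are hypothetical, and the database query is replaced by a fixed made-up record so the example runs on its own. It does not reproduce the code in our repository.

```python
import json


def find_geographies(lon, lat):
    """Placeholder for the database query (hypothetical).

    A real service would run a spatial query against the crosswalk table;
    this stub returns a fixed made-up record to keep the sketch runnable.
    """
    return {"city_name": "EXAMPLE CITY", "postal_code": "00000"}


def json_response(status, payload):
    """Build a response in the shape API Gateway's proxy integration expects."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    """Lambda-style entry point for an API Gateway proxy request."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    try:
        lat = float(params["lat"])
        lon = float(params["lon"])
    except (KeyError, TypeError, ValueError):
        return json_response(
            400, {"error": "lat and lon query parameters are required"}
        )
    return json_response(
        200, {"lat": lat, "lon": lon, "geographies": find_geographies(lon, lat)}
    )


ok = handler({"queryStringParameters": {"lat": "34.05", "lon": "-118.25"}}, None)
print(ok["statusCode"], ok["body"])
```

Validating the coordinates before touching the database keeps malformed requests cheap and gives callers a clear error message.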

Feel free to create an issue on the GitHub repo to begin a conversation on this project.

Authors

Gaurav Bhardwaj is a graduate student at NYU CUSP, pursuing an MS in Urban Informatics. He is a spatial data enthusiast and has been taking advanced-level courses in spatial analytics. His interests include creating new data-driven products aimed at smart cities.

Lingyi Zhang is a graduate student at NYU Center for Urban Science and Progress (CUSP). She loves developing data-driven solutions for urban questions.

A.R.G.O. (Applied Research in Government Operations): Startup non-profit building data infrastructure for public service delivery. Team staffing @cadc_io