A performant Mapbox implementation of Census Tracts at all zoom levels

A walkthrough of how to create an optimized and performant responsive census tract choropleth map with Mapbox

Christopher Lanoue
Graphicacy
5 min read · Nov 17, 2020

An animated zooming video of median household incomes, by census tract, in and around Cook County, Illinois.
A performant and responsive view of Median Household Income at the census tract level in the state of Illinois

One of the most common ways to visualize data is through mapping. However, presenting data in a map format means weighing a multitude of options. First, what type of visualization will work best? A filled (“choropleth”) map, a symbol map, a density map, a hexbin map? The chosen form will dictate, to some extent, the performance of the final visualization.

Let’s look at a case study where the Graphicacy engineering team approached the problem of how to map 73,000+ United States census tracts of highly varied shapes and sizes. For most DIY data visualization tools, the final result would be nightmarishly slow to render at even the highest zoom level (never mind granular concentrations of tracts in metropolitan areas). But this team proved that with proper planning and execution, even a highly dense map reliant on shapefiles could be made into an agile tool with Mapbox's open-source technology.

Data

The US Census Bureau publishes census tract cartographic boundary shapefiles for each US state and territory. It is possible to manually download and extract each compressed zip file to your local machine, but because census tracts change over time, it is beneficial to have a repeatable approach.

In our implementation, we loop through the FIPS codes for the 50 US states, download each compressed zip file, extract the shapefiles, and load them via shp2pgsql into a local PostgreSQL database with the PostGIS extension installed. This process assumes that you are running OS X, have installed PostgreSQL, and have the shp2pgsql CLI tool, which usually ships with PostGIS.
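The loop described above might look something like the following sketch, written as a small Node.js script that prints the shell commands rather than running them. The TIGER/Line 2019 URL pattern, the `-s 4269` SRID (NAD83), and the helper names are assumptions made for illustration, not details of the original script; the table and database names match the ogr2ogr commands used in the next section.

```javascript
// Sketch: print shell commands that download each state's 2019 tract shapefile
// and load it into PostGIS. The URL pattern and SRID are assumptions; verify
// them against the Census Bureau site before running anything.

// State FIPS codes run from 01 to 56; these values are not assigned to any of
// the 50 states (11 is the District of Columbia).
const SKIPPED = new Set([3, 7, 11, 14, 43, 52]);

const STATE_FIPS = [];
for (let code = 1; code <= 56; code++) {
  if (!SKIPPED.has(code)) STATE_FIPS.push(String(code).padStart(2, '0'));
}

// Assumed TIGER/Line download location for a single state's tracts.
function tractZipUrl(fips) {
  return `https://www2.census.gov/geo/tiger/TIGER2019/TRACT/tl_2019_${fips}_tract.zip`;
}

// Commands for one state: download, unzip, then create (-c) or append (-a)
// rows in the shared tl_2019_tracts table.
function loadCommands(fips, isFirst) {
  const name = `tl_2019_${fips}_tract`;
  const mode = isFirst ? '-c' : '-a';
  return [
    `curl -sSfO ${tractZipUrl(fips)}`,
    `unzip -o ${name}.zip -d ${name}`,
    `shp2pgsql -s 4269 ${mode} ${name}/${name}.shp tl_2019_tracts | psql -U postgres -d gis_db`,
  ];
}

// One line per command, in state order; only the first state creates the table.
const script = STATE_FIPS
  .flatMap((fips, i) => loadCommands(fips, i === 0))
  .join('\n');

console.log(script);
```

Printing the commands first makes them easy to inspect; piping the output to a shell (for example `node load-tracts.js | bash`, where the file name is hypothetical) would then run the downloads and loads.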

Data Processing

Once the shapefiles are loaded into PostgreSQL, we can use SQL and the ogr2ogr CLI tool to combine the geometries and save them as a single GeoJSON file. The script below produces a GeoJSON file of more than 1 GB.

ogr2ogr -f "GeoJSON" tracts.geojson \
-sql "SELECT geoid, geom FROM tl_2019_tracts" \
PG:"host=localhost user=postgres dbname=gis_db port=5432" \
-progress

One issue, shown below, is that the raw census tract shapefiles extend into the ocean and other bodies of water, so we clipped the census tracts to the state boundaries to get a cleaner, more recognizable picture.

A raw census tract map for the Northeast of the United States with boundaries extending into the ocean.
A view of the raw census tract tiles in the Northeast

To clip the census tracts by the state boundaries, we need to first download and process the US State boundary shapefile from the US Census Bureau. Once that shapefile is in our database, we can run the updated SQL and ogr2ogr script below to retrieve a new GeoJSON file for the clipped geometries — which should come in around the same 1GB+ file size.

ogr2ogr -f "GeoJSON" tracts-clipped-by-state.geojson \
-sql "SELECT tracts.geoid, (ST_Dump(ST_Intersection(states.geom, tracts.geom))).geom geom FROM cb_2018_us_state_500k states INNER JOIN tl_2019_tracts tracts ON ST_Intersects(states.geom, tracts.geom)" \
PG:"host=localhost user=postgres dbname=gis_db port=5432" \
-progress

A view of the clipped census tract tiles in the Northeast

Upload and style in Studio

The next step in creating a performant census tract map across all zoom levels is to build a vector tileset from our clipped GeoJSON dataset and upload it to Mapbox. To build the tileset, we turned to Tippecanoe, an open-source tool from Mapbox. Tippecanoe takes a GeoJSON, CSV, or Geobuf file and outputs a vector tileset as an .mbtiles file. It offers many options and recipes; we followed the recipe for continuous polygon features that are visible at all zoom levels:

tippecanoe -zg \
-o tracts_2019_clipped.mbtiles \
--coalesce-densest-as-needed \
--extend-zooms-if-still-dropping -aI \
--use-attribute-for-id=geoid tracts-clipped-by-state.geojson

After a few minutes of processing and behind-the-scenes magic, we have a single optimized package of census tract tiles at all zoom levels, which is easily uploaded into Mapbox Studio as a new Tileset by dragging and dropping.

A dashboard representation of the census tract dataset uploaded into Mapbox Studio.
The clipped census tracts tileset uploaded into Mapbox Studio

Add Styles

Once the tileset is uploaded into Mapbox Studio, we need to add a new Style and attach the tileset to it. We chose to keep the default fill and stroke styles because we wanted to do the styling and painting on the client side with Mapbox Expressions.

The Mapbox Studio Style panel for assigning styles to all features and properties of the census tract tiles.
Adding styles to the clipped census tracts tileset in Mapbox Studio

Render with Mapbox GL JS

A blue sequential threshold legend for Median Household Income, by census tract, in the United States
A threshold legend for Median Household Income, by census tract, in the United States

After everything is ready and saved in Mapbox Studio, we can retrieve the tiles with their corresponding styles on the client-side using Mapbox GL JS. Then we can render the choropleth map using Mapbox Expressions.
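As a rough sketch of that wiring, the function below creates a map from a Studio style and, once the style has loaded, overrides the fill color of the tract layer. The style URL, layer ID, and function name are placeholders made up for illustration; `mapboxgl` is passed in as an argument only so that the snippet does not depend on how the library is loaded.

```javascript
// Placeholders: substitute the style URL and layer ID from your own Studio account.
const STYLE_URL = 'mapbox://styles/your-username/your-style-id';
const TRACT_LAYER_ID = 'tracts-2019-clipped';

// Creates the map and applies a data-driven fill color once the style is ready.
// `fillColorExpression` is any valid Mapbox expression that evaluates to a color.
function initTractMap(mapboxgl, accessToken, fillColorExpression) {
  mapboxgl.accessToken = accessToken;

  const map = new mapboxgl.Map({
    container: 'map',          // id of the HTML element that hosts the map
    style: STYLE_URL,          // the style carries the tileset and default paint
    center: [-87.63, 41.88],   // Chicago, Cook County
    zoom: 8,
  });

  // Layers from the style only exist after 'load', so paint changes wait for it.
  map.on('load', () => {
    map.setPaintProperty(TRACT_LAYER_ID, 'fill-color', fillColorExpression);
  });

  return map;
}
```

Keeping the default paint in Studio and overriding `fill-color` in the browser means the breaks and colors can change without republishing the style.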

For this example, we found that the census tract data provided by the diversitydatakids.org group at Brandeis University was the easiest to consume and join to the census tract geometries — by geoid. Specifically, we were interested in exploring median household incomes from the American Community Survey (ACS) across the country, so we loaded that dataset into our database via their easy-to-use API, joined with the census tract geometries, and, subsequently, used Jenks optimization to find breaks in the values for coloring the map — see legend image above. Lastly, we opted for a sequential blue color scheme courtesy of ColorBrewer for showing the breaks in the data.
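To make the idea of Jenks breaks concrete, here is a small dynamic-programming sketch that splits sorted values into k classes with the smallest total within-class squared deviation. It is an illustration rather than the code behind the map, and the function name is hypothetical. Its running time grows with the square of the number of values, so for tens of thousands of tracts a sample of the values or an established statistics library would likely be the more practical choice.

```javascript
// Returns the k - 1 lower bounds of classes 2..k (usable as `step` stops).
function jenksBreaks(values, k) {
  const x = [...values].sort((a, b) => a - b);
  const n = x.length;
  if (k < 1 || k > n) throw new Error('k must be between 1 and the number of values');

  // Prefix sums of values and squares give the squared deviation of any run in O(1).
  const sum = [0];
  const sumSq = [0];
  for (const v of x) {
    sum.push(sum[sum.length - 1] + v);
    sumSq.push(sumSq[sumSq.length - 1] + v * v);
  }
  const deviation = (i, j) => { // x[i..j], inclusive
    const total = sum[j + 1] - sum[i];
    return (sumSq[j + 1] - sumSq[i]) - (total * total) / (j - i + 1);
  };

  // cost[c][j]: best total deviation when x[0..j] is split into c + 1 classes.
  // start[c][j]: index where the last of those classes begins.
  const cost = Array.from({ length: k }, () => new Array(n).fill(Infinity));
  const start = Array.from({ length: k }, () => new Array(n).fill(0));
  for (let j = 0; j < n; j++) cost[0][j] = deviation(0, j);

  for (let c = 1; c < k; c++) {
    for (let j = c; j < n; j++) {
      for (let i = c; i <= j; i++) {
        const candidate = cost[c - 1][i - 1] + deviation(i, j);
        if (candidate < cost[c][j]) {
          cost[c][j] = candidate;
          start[c][j] = i;
        }
      }
    }
  }

  // Walk back from the last class to recover where each class starts.
  const breaks = [];
  let end = n - 1;
  for (let c = k - 1; c >= 1; c--) {
    const i = start[c][end];
    breaks.unshift(x[i]);
    end = i - 1;
  }
  return breaks;
}

console.log(jenksBreaks([52, 1, 10, 2, 51, 11, 3, 50, 12], 3)); // [ 10, 50 ]
```

Returning the lower bound of each class, rather than the upper bound, matches how a `step` expression switches to the next output once the input reaches a stop.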

Mapbox expression for creating a choropleth map on median household income
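An expression of this kind can be sketched with Mapbox's `step` operator, which returns the first color for values below the first break and moves to the next color at each break. The break values, the `median_income` property name, and the helper function are illustrative assumptions, not the values behind the legend above; the five colors are the ColorBrewer 5-class Blues scheme.

```javascript
// ColorBrewer 5-class "Blues", lightest to darkest.
const BLUES = ['#eff3ff', '#bdd7e7', '#6baed6', '#3182bd', '#08519c'];

// Placeholder breaks in dollars; substitute the Jenks breaks for your data.
const BREAKS = [35000, 55000, 80000, 120000];

// Builds ["step", input, color0, break1, color1, ..., breakN, colorN].
// Breaks must be in strictly ascending order, as the step operator requires.
function buildStepExpression(input, breaks, colors) {
  if (colors.length !== breaks.length + 1) {
    throw new Error('need exactly one more color than breaks');
  }
  const expression = ['step', input, colors[0]];
  breaks.forEach((stop, i) => expression.push(stop, colors[i + 1]));
  return expression;
}

// Assumes each tract feature carries a numeric `median_income` property (hypothetical).
const fillColor = buildStepExpression(['get', 'median_income'], BREAKS, BLUES);

console.log(JSON.stringify(fillColor));
```

The resulting array could be passed to `map.setPaintProperty(layerId, 'fill-color', fillColor)`. If the income values are instead attached in the browser with `map.setFeatureState`, keyed by the feature id that Tippecanoe took from geoid, `['feature-state', 'median_income']` would take the place of the `get` input.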

After solving the problem of creating a performant responsive web application for mapping 73,000+ census tracts in the United States at all zoom levels, the engineering team at Graphicacy is looking to the next mapping challenge. If your organization is struggling with how to best display your geographic data, please reach out and let us help you build a performant and optimized user experience.

Christopher Lanoue is a Creative Technologist/Data Visualization Engineer who is currently the Director of Engineering and Innovation at Graphicacy with a focus on designing and building innovative and creative solutions for mission-driven clients all over the world.
