Visualizing US census data with BigQuery and Looker Studio

Marc Soares
4 min readMay 7, 2024

Census data provides an incredibly rich snapshot of the American population. The wide variety of available measures and segments create endless potential for exploration and visualization. In this article, I will demonstrate how to build an advanced geospatial visualization in Looker Studio using Census Bureau data from BigQuery.

The US Census Bureau makes several datasets available through the Google Cloud Public Dataset Program. These datasets are hosted in BigQuery and accessible to anyone with a Google Cloud account. For this exercise, we will focus on data from the American Community Survey (ACS).

The ACS dataset contains annual tables going back 10+ years at different levels of granularity — state, county, zip code, census tract, etc. Now, Looker Studio can already map states, counties, and zip codes based on just the name or appropriate geographic code. So, to demonstrate a true use case for advanced geo mapping, we will explore data at the census tract level.

Let’s use the most recent ACS table and build a query for median age by census tract:

SELECT 
geo_id,
median_age
FROM `bigquery-public-data.census_bureau_acs.censustract_2020_5yr`
WHERE geo_id LIKE "36061%" --Filter for Manhattan

Here we use the WHERE clause to include only tracts in New York County (i.e. Manhattan). The ID of each census tract is prefixed by the county-level ID, and 36061 is the ID for New York County. For more detail on how geo IDs are defined, see this page from the US Census Bureau.

We now have our data broken down by census tract. But to map this in Looker Studio, we also need to provide geographic definitions of the census tract boundaries (often called shapefiles or polygons). Luckily, the Census Bureau also makes the census tract boundaries available in BigQuery.

We can join the census tract boundaries into our query as follows:

SELECT 
acs.geo_id,
acs.median_age,
geo.tract_geom
FROM `bigquery-public-data.census_bureau_acs.censustract_2020_5yr` acs
JOIN `bigquery-public-data.geo_census_tracts.census_tracts_new_york` geo
ON acs.geo_id = geo.geo_id
WHERE acs.geo_id LIKE "36061%"
AND median_age > 0

The median_age > 0 condition excludes census tracts without valid data. In cases where a tract didn’t meet a minimum sample size, the ACS data includes negative value codes like “-666666666”. See the Census Bureau’s Notes on ACS Data for more information on this and other possible codes.

Running this query produces the following results. Notice how the tract_geom contains the polygon definition of the census tract boundary:

We can now place our query directly into Looker Studio by creating a BigQuery data source and selecting the Custom Query option:

When we inspect the data source schema, we should see that the tract_geom field is recognized as a Geospatial field. Looker Studio should do this automatically based on the GEOGRAPHY type in BigQuery:

Now we can build our visualization!

Add a Google Maps chart with geo_id as Location, tract_geom as Geospatial field, and median_age as Color metric:

The result should look something like this:

Median age by census tract in Manhattan

Feel free to play around with the style options to make the data easier to interpret. For example, you can increase the opacity and apply a sequential colour palette.

You can also go back and adjust the query to bring in additional metrics from the ACS like median_rent and median_income:

This exercise just scratches the surface of what’s possible with geospatial data in BigQuery and Looker Studio. Be sure to also check out my article on visualizing Formula 1 Circuits as line maps.

--

--