The Startup
Published in

The Startup

How to Create County Boundary Maps Only of Populated Areas

Use BigQuery GIS to trim county boundaries to include only pixels with population density > 10 people per sq km and use these boundaries in your plotting programs.

Population Density Raster Data

Joining a vector dataset of county boundaries against a raster dataset of population density can be quite hard. Earlier, I have written about how you can efficiently query raster data in BigQuery by demonstrating on the global population density dataset published by Columbia University. Here, I am using that raster dataset (I’ve made it public) and joining it against US count-level data to trim the county boundaries in Washington State to just populated areas:

WITH populated_areas AS (
SELECT bounds
FROM `ai-analytics-solutions`.publicdata.popdensity_sedac_rectangles
WHERE
year = 2020 AND
tile = 'gpw_v4_population_density_rev11_2020_30_sec_1.asc' AND
population_density > 10 -- persons per square kilometer
)
SELECT
state_fips_code, county_fips_code,
ANY_VALUE(county_geom) AS tract_geom,
ST_UNION(ARRAY_AGG(bounds)) AS populated_tract_geom
FROM
`bigquery-public-data`.geo_us_boundaries.counties,
populated_areas
WHERE
state_fips_code = '53' AND
ST_CONTAINS(county_geom, bounds)
GROUP BY 1, 2
  • You get 39 results, one for each county in Washington
  • While the original tract_geom is a POLYGON, the populated_tract_geom is a MULTI_POLYGON since there are multiple pixels in each county

Full dataset (for whole US)

Let’s do this for the whole country, not just Washington, and send the output to a table for easy querying:

CREATE OR REPLACE TABLE publicdata.us_tracts AS
WITH populated_areas AS (
SELECT bounds
FROM `ai-analytics-solutions`.publicdata.popdensity_sedac_rectangles
WHERE
year = 2020 AND
(tile = 'gpw_v4_population_density_rev11_2020_30_sec_1.asc' OR
tile = 'gpw_v4_population_density_rev11_2020_30_sec_2.asc'
)
AND
population_density > 10 -- persons per square kilometer
)
SELECT
state_fips_code, county_fips_code,
ANY_VALUE(county_geom) AS tract_geom,
ST_UNION(ARRAY_AGG(bounds)) AS populated_tract_geom
FROM
`bigquery-public-data`.geo_us_boundaries.counties,
populated_areas
WHERE
ST_CONTAINS(county_geom, bounds)
GROUP BY 1, 2
ai-analytics-solutions.publicdata.us_tracts
Query to create county boundaries where the boundaries of populated pixels within each county.

Example of using the county boundaries to plot weather-related fatalities

As an example of using the county boundaries, let’s plot the number of weather-related fatalities in the US in the 2010s:

WITH historical_weather AS (
SELECT
CONCAT(LPAD(state_fips_code, 2, '0'), LPAD(cz_fips_code, 3, '0')) AS county_fips_code,
SUM( injuries_direct + deaths_direct + deaths_indirect + injuries_indirect ) AS num_impacted
FROM `bigquery-public-data.noaa_historic_severe_storms.storms_201*`
GROUP BY 1
)
SELECT
tract_geom, populated_tract_geom, (num_impacted/population) AS impacted
FROM
historical_weather
JOIN
`ai-analytics-solutions.publicdata.us_tracts`
USING(county_fips_code)

Exporting table AS CSV

If you are not doing your GIS analysis in BigQuery, you might want to export the populated county tracts. You can do that by doing this in the UI:

SELECT
state_fips_code, county_fips_code,
ST_AsGeoJSON(populated_tract_geom) AS county_tracts
FROM `ai-analytics-solutions.publicdata.us_tracts`

--

--

Get smarter at building your thing. Follow to join The Startup’s +8 million monthly readers & +768K followers.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store