Dasymetric reaggregation using Mapshaper

Denise Lu
Mar 1, 2020

Or, how to use data collected in one geography to characterize another overlapping but different geography. In my most recent example of this, I made a cleaner workflow using Mapshaper.

I compiled some demographic data for the precinct results pages for The New York Times. The problem is that the data comes from Census tracts, but our map was presented as precincts, and these two geographies vary quite a bit.

One of the traits we were curious about was age. What share of our precinct’s population were older residents (aged 60 and up)? On initial thought, it might be tempting to just get the share of older residents from the tract our precinct overlaps with and use that metric. However, that assumes that the population is evenly distributed throughout the entire tract, which we know isn’t true. Plus, our precinct overlaps with four tracts, which presumably all have different shares of older residents.

The solution is to use Census blocks, the smallest geography the two geographies have in common, as a go-between. You can download Census blocks with population data here, named by state FIPS codes.

Census blocks will be a proxy to distribute population throughout the larger geographies. There are basically three steps:

  1. Spatial join the blocks to the source geography (Census tracts) and count up its total population from the blocks’ population data.
  2. Spatial join the source geography BACK to the blocks to calculate how much “weight” each block should have based on its share of the total population in the source geography, and use this weight to approximate data attributes from the source geography down to the block.
  3. Spatial join the blocks to the destination geography (precinct) and add up the weighted data values of all the blocks in the destination geography.
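The arithmetic behind these three steps can be sketched without any geometry. The following is a minimal illustration in plain JavaScript, not the Mapshaper workflow itself: it assumes each block already knows which tract and precinct it falls in, which is what the spatial joins provide in practice. The `tract` and `precinct` fields, the sample numbers, and the `reaggregate` function are all hypothetical.

```javascript
// Hypothetical toy data: membership is given by id instead of by a spatial join.
const tracts = {
  A: { AGE_OVER_60: 100 },
  B: { AGE_OVER_60: 30 },
};
const blocks = [
  { tract: 'A', precinct: 'P1', POP10: 300 },
  { tract: 'A', precinct: 'P2', POP10: 100 },
  { tract: 'B', precinct: 'P1', POP10: 50 },
  { tract: 'B', precinct: 'P2', POP10: 150 },
];

function reaggregate(tracts, blocks, field) {
  // Step 1: total block population per tract
  const totals = {};
  for (const b of blocks) {
    totals[b.tract] = (totals[b.tract] || 0) + b.POP10;
  }
  // Steps 2 and 3: weight each block's share of the tract value,
  // then sum the weighted values per precinct
  const result = {};
  for (const b of blocks) {
    const weight = b.POP10 / totals[b.tract];
    const weighted = weight * tracts[b.tract][field];
    result[b.precinct] = (result[b.precinct] || 0) + weighted;
  }
  return result;
}

const estimates = reaggregate(tracts, blocks, 'AGE_OVER_60');
console.log(estimates); // { P1: 82.5, P2: 47.5 }
```

The two precinct estimates add up to 130, the same as the two tract values combined: the weights redistribute the tract totals without creating or losing anyone.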

I decided to use Mapshaper in Node.js this time (usually I only use Mapshaper as a command-line tool) because it was easier to deal with variable names and data fields this way.

Here’s the top of the file, which just sets up some variables and calls all the functions:

const mapshaper = require('mapshaper');

// tracts geo to grab census data from
const srcGeo = 'tracts.json';
// precinct geo to reaggregate data to
const destGeo = 'precinct.json';
// census blocks
const blocksGeo = 'blocks.json';

// steps to run through
async function runAll() {
  await joinBlocksTosrcGeo();
  await weightBlocks();
  await joinWeightedBlocksTodestGeo();
}

runAll();

The first step is to join blocks to the source geography, tracts.

async function joinBlocksTosrcGeo() {
  let cmd = `-i ${srcGeo} -join ${blocksGeo} calc='TOTPOP10=sum(POP10)' -o force format=geojson tracts_with_blocks_pop.json`;
  await mapshaper.runCommands(cmd);
}

In this step, I figure out which blocks overlap with each tract in the source geography and calculate a variable, TOTPOP10, that sums the populations of the blocks (POP10) within that tract.

Here are the attributes of one of our joined tracts. In addition to the original data attributes it already had, there’s now a TOTPOP10 field that is the sum of the population of the blocks within the tract. It’s important to use this summed-up population as the denominator to calculate the “weight” of each block in the next step. Note that I also have a POPULATION field that comes from the tract itself, which I’ll use later.

The second step is to join the tracts BACK to the blocks so each block will have this TOTPOP10 attribute, and I can then calculate its “weight” within the tract.

// data fields in the census tract geography
const dataFields = [
  'POPULATION',
  'AGE_OVER_60',
  'AGE_UNDER_35'
];

async function weightBlocks() {
  // make a string out of the fields I want to calculate to feed into the mapshaper command
  let dataFieldsConcat = [];
  dataFields.forEach((field) => {
    // calculates a new weighted field by multiplying the block's weight with the totals from the tract
    let weight = '(POP10/TOTPOP10)';
    dataFieldsConcat.push(`WEIGHTED_${field}=${weight}*${field}`);
  });
  let dataArg = dataFieldsConcat.join(', ');
  let cmd = `-i ${blocksGeo} -join tracts_with_blocks_pop.json -each '${dataArg}' -o force format=geojson blocks_with_weighted_data.json`;

  await mapshaper.runCommands(cmd);
}

In this step, I calculate a “weight” for each block by dividing the block’s population by the population of all the blocks in that tract. Then, I multiply the weight with other data points from the tract, such as population over 60, to approximate the block’s own population over 60.
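One edge case worth keeping in mind: if every block in a tract has zero population, the weight becomes 0 divided by 0, which is NaN in JavaScript. Since Mapshaper's -each command evaluates JavaScript expressions, one possible guard is a ternary inside the weight expression. The sketch below only builds the expression string; the helper name `buildWeightedExpression` is hypothetical, and whether zero-population tracts actually occur depends on your data.

```javascript
// Hypothetical helper: builds the -each expression with a zero-population guard.
// Blocks in tracts whose blocks sum to zero population get a weight of 0.
function buildWeightedExpression(fields) {
  const weight = '(TOTPOP10 > 0 ? POP10 / TOTPOP10 : 0)';
  return fields
    .map((field) => `WEIGHTED_${field}=${weight}*${field}`)
    .join(', ');
}

console.log(buildWeightedExpression(['POPULATION', 'AGE_OVER_60']));
```

The returned string could stand in for dataArg in the command, if the guard turns out to be needed.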

Here are the attributes of one of our joined blocks. The first POP10 attribute is the block’s population. The next four attributes belong to the tract it’s in. The weight of this block is POP10/TOTPOP10 = 34/3694 = 0.0092, which means that this block has approximately 0.92 percent of the tract’s total population. If I then multiply this weight by the tract’s total population over 60, I get 774 * 0.0092 = 7.12, its WEIGHTED_AGE_OVER_60.
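The same arithmetic can be checked in a few lines of JavaScript. This is only a restatement of the numbers quoted above; the constant names mirror the data fields, and the rounding is done for display.

```javascript
// Values quoted in the text for one block and its tract
const POP10 = 34;        // block population
const TOTPOP10 = 3694;   // sum of block populations in the tract
const AGE_OVER_60 = 774; // tract population aged 60 and up

const weight = POP10 / TOTPOP10;
const weightedOver60 = weight * AGE_OVER_60;

console.log(weight.toFixed(4));         // "0.0092"
console.log(weightedOver60.toFixed(2)); // "7.12"
```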

One important note is that these weighted populations are obviously approximations and should not be used as definite numbers for the block’s actual population. I can, however, use these weighted populations in aggregate to see trends by summing them up to our precincts, the ultimate geography I would like the data in.

async function joinWeightedBlocksTodestGeo() {
  // make a string out of the weighted fields I want to sum up to feed into the mapshaper command
  let weightedFieldsConcat = [];
  dataFields.forEach((field) => {
    // calculates an approximate total demographic population within the precinct by summing up all the weighted demographic populations of the blocks within the precinct
    weightedFieldsConcat.push(`REAGGREGATED_${field}=sum(WEIGHTED_${field})`);
    // calculates a share of the weighted demographic populations within the precinct by dividing by the weighted total population
    weightedFieldsConcat.push(`SHARE_${field}=sum(WEIGHTED_${field})/sum(WEIGHTED_POPULATION)`);
  });
  let dataArg = weightedFieldsConcat.join(', ');
  let cmd = `-i ${destGeo} -join blocks_with_weighted_data.json calc='${dataArg}' -o force format=geojson reaggregated_precincts.json`;
  await mapshaper.runCommands(cmd);
}

In this step, I sum up the weighted populations of the blocks within the precinct to come up with estimates for the precinct’s populations. I’m also calculating a share of the different demographics within the precinct because that was what we wanted to use for analysis.
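To make the two calculations concrete, here is a small sketch in plain JavaScript of what the sum and share amount to for a single precinct. The two block records and their values are made up for illustration, and `sumField` is a hypothetical helper, not part of Mapshaper.

```javascript
// Hypothetical weighted blocks that fall inside one precinct
const precinctBlocks = [
  { WEIGHTED_POPULATION: 40, WEIGHTED_AGE_OVER_60: 10 },
  { WEIGHTED_POPULATION: 60, WEIGHTED_AGE_OVER_60: 15 },
];

// Adds up one field across all blocks in the precinct
function sumField(records, field) {
  return records.reduce((total, record) => total + record[field], 0);
}

const reaggregatedOver60 = sumField(precinctBlocks, 'WEIGHTED_AGE_OVER_60');
const reaggregatedPopulation = sumField(precinctBlocks, 'WEIGHTED_POPULATION');
const shareOver60 = reaggregatedOver60 / reaggregatedPopulation;

console.log(reaggregatedOver60); // 25
console.log(shareOver60);        // 0.25
```

Dividing by the summed weighted population, rather than by a population figure from another source, keeps the numerator and denominator on the same estimated basis.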

Here are the attributes of the precinct with reaggregated data.

Again, neither the reaggregated absolute populations nor the shares should be isolated and reported on their own, since they are approximated data points.

However, this method is useful for reaggregation to a large number of geographies so you can see trends across a larger collection, such as thousands of precincts within the state.

Thanks to Matthew Bloch, the creator of Mapshaper who conveniently sits behind me at work and happily answers all my Mapshaper questions.
