Open Precinct Data

Michal Migurski
Apr 9, 2018 · 4 min read

Several weeks ago, I spent an extended weekend at the fifth (of five) Geometry in Redistricting conference. I spoke and participated in a panel on law, tech, and gerrymandering, and organizer Moon Duchin asked for my help organizing the conference hackathon. One theme I heard repeated throughout the event was the difficulty of finding reliable precinct geography and election results.

Precinct shapes in Wisconsin, covering Madison (center) and Milwaukee (right)

There’s an opportunity here for a new data project focused on connecting existing academic and independent efforts with durable, unique, permanent identifiers for nationwide voting precincts. Imagine if you could easily correlate detailed voting results from OpenElections.net (OE) or state boards of elections with mapped polygons and census geography over time. We already know how effective a GEOID-based approach can be thanks to data published by the U.S. Census, but precincts are a special challenge without a current champion.
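To make the GEOID idea concrete: a census block GEOID is a 15-digit string that nests state, county, tract, and block codes, so any dataset keyed by GEOID can be rolled up or joined by simple string prefixes. The sketch below is only an illustration of that nesting. The function names are invented for this post, and the tract and block digits in the usage note are placeholders rather than a real block.

```python
from typing import NamedTuple


class BlockGeoid(NamedTuple):
    state: str   # 2-digit state FIPS code
    county: str  # 3-digit county FIPS code
    tract: str   # 6-digit census tract code
    block: str   # 4-digit census block code


def parse_block_geoid(geoid: str) -> BlockGeoid:
    """Split a 15-digit census block GEOID into its nested parts."""
    if len(geoid) != 15 or not geoid.isdigit():
        raise ValueError(f"expected 15 digits, got {geoid!r}")
    return BlockGeoid(geoid[:2], geoid[2:5], geoid[5:11], geoid[11:])


def county_geoid(geoid: str) -> str:
    """A county GEOID is the first five digits of any finer-grained GEOID."""
    return geoid[:5]
```

For example, `county_geoid("550250001001000")` yields `"55025"`, the code for Dane County, Wisconsin. Precincts have no equivalent nationwide key today, which is the gap this post is about.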

What are precincts good for?

These are just a few uses for precinct-level data. The data is a hot mess, and in need of an organizing effort. It needs a home. Right now, geographic data is available piecemeal from a variety of sources but rarely from state-level authorities who should be collecting and publishing it. Instead, users must know about resources like Harvard’s Election Data Archive (up to 2011) or the ongoing Election Geodata repository that Nathaniel V. Kelso and I maintain. For key newsworthy states like Pennsylvania, it’s a bad sign that both the Washington Post and New York Times cite our volunteer GitHub repository instead of an official government source.

Existing Precinct Data

What about VTDs? The U.S. Census Bureau conducted a nationwide collection of Vote Tabulation Districts after 2010. These are easy to confuse with precincts, but a group of Geometry in Redistricting hackathon participants from Duke University, Pennsylvania, and elsewhere showed that VTD and precinct data in North Carolina are not the same. Some precincts cover multiple VTDs, while others don’t match at all.
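One way to picture this kind of comparison, as a minimal sketch rather than the method the hackathon group actually used: assume a separate spatial overlay step (not shown) has already produced pairs of overlapping precinct and VTD identifiers, then count how many VTDs each precinct touches. The function name, labels, and identifiers here are all hypothetical.

```python
from collections import defaultdict


def classify_precincts(precinct_ids, overlaps):
    """Label each precinct by how many distinct VTDs it overlaps.

    overlaps is an iterable of (precinct_id, vtd_id) pairs, assumed to
    come from a spatial overlay performed elsewhere.
    """
    vtds_by_precinct = defaultdict(set)
    for precinct_id, vtd_id in overlaps:
        vtds_by_precinct[precinct_id].add(vtd_id)

    labels = {}
    for precinct_id in precinct_ids:
        count = len(vtds_by_precinct.get(precinct_id, ()))
        if count == 0:
            labels[precinct_id] = "no matching VTD"
        elif count == 1:
            labels[precinct_id] = "single VTD"
        else:
            labels[precinct_id] = "multiple VTDs"
    return labels
```

Note that "single VTD" only means one overlapping VTD was found. It does not show that the two boundaries are identical, which would need a geometric comparison.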

What about the OpenElections project? OE collects precinct-level vote totals for elections nationwide. Data from returns is collected from counties and states but it doesn’t include geographic boundaries. OE is an ambitious project led by journalists that nicely handles results suitable for election night reporting. Precinct geography data should connect with OE wherever possible, but currently this is a messy and manual process.
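A rough sketch of why this join is messy: without shared identifiers, results and shapes can only be matched on county and precinct names, which rarely agree exactly between sources. The code below normalizes names into a crude key and reports the rows that fail to match. The field names (`county`, `precinct`, `name`, `shape_id`) are assumptions made for illustration, not the actual OE or Election Geodata schema.

```python
import re


def precinct_key(county: str, precinct: str) -> tuple:
    """Build a crude join key: lowercase, punctuation dropped, spaces collapsed."""
    def clean(text: str) -> str:
        text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
        return " ".join(text.split())
    return (clean(county), clean(precinct))


def join_results(results, shapes):
    """Attach a shape_id to each result row; return (joined, unmatched)."""
    shapes_by_key = {precinct_key(s["county"], s["name"]): s for s in shapes}
    joined, unmatched = [], []
    for row in results:
        shape = shapes_by_key.get(precinct_key(row["county"], row["precinct"]))
        if shape is None:
            unmatched.append(row)  # left for manual review
        else:
            joined.append({**row, "shape_id": shape["shape_id"]})
    return joined, unmatched
```

With this key, a results row named "City of Madison Ward 1" matches a shape named "City of Madison - Ward 1", but an abbreviated "C. of Madison, Ward 2" does not match "City of Madison - Ward 2" and lands in the unmatched list. Stable precinct identifiers would replace this guesswork with an exact key lookup.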

What about the Voting Information Project (VIP)? In 2012, the Google-supported VIP published precinct descriptions for many states. These came in XML format as lists and ranges of addresses. I spent a week of quality time connecting them to U.S. Census TIGER data with some success. VIP no longer appears to publish data in this form.

What about data from state election officials, like secretaries of state? A few states proactively publish correct precinct geography linked to specific elections, but most don’t. Precinct areas are often a county-level concern, created and maintained to support local election operations without rolling up to a statewide dataset. Pennsylvania and Maryland don’t offer consistent statewide precinct geography, and datasets for these states must be collected via telephone and 1:1 inquiries. The delivered datasets don’t always match election results, and must be carefully inspected. Counties change precincts continuously (we’re not sure why), so data collected long after an election may or may not match the precincts in effect during voting.

An Opportunity For A Project

I have some experience with this type of large-scale spatial data project. At OpenAddresses.io we’ve been collecting and organizing worldwide address data for four years, a similar scale of effort to collecting nationwide precinct polygons. At Mapzen, Aaron Cope’s Who’s On First place gazetteer took inspiration from Yahoo!’s Where On Earth IDs to center on the provision of unique and immutable numeric identifiers. A precinct project should address these needs:

This won’t solve the missing “.gov” problem, but with some coordination between universities and independent projects like PlanScore or OpenElections, we should be able to arrive at a mutually-beneficial hub for precinct data that addresses 80% of everyone’s needs and provides a critical backbone for election data research leading up to the 2020 redistricting cycle.

What next? Get in touch if this sounds interesting to you. We’re already building pieces of this puzzle at PlanScore.org to meet our own needs. It’s unproductive to work on a potentially-shared effort in isolation. Let’s share the load!

Thanks to Anne, Deborah, Derek, John, Michael, Nathaniel, Tom, and William for their feedback and encouragement on early drafts of this post.

PlanScore

Measuring partisan gerrymandering

Michal Migurski

Written by

Oakland/SF Bay Area technology & open source GIS. @Remix and @PlanScore, previously at @mapzen, @codeforamerica, and @stamen. Frequently at @geobreakfast.
