How Accurate is Matching ZIP Codes to Legislative Districts?
For approximately 22.6 million Americans, knowing your ZIP code might not be enough to find out who represents you in Congress.
Figure: Matching constituents to legislative districts using only a ZIP code is not always accurate.
In our work on Cicero, we offer solutions for matching constituents with their representatives based on our comprehensive database of legislative districts. Cicero’s latitude/longitude and address-level matching provide high-accuracy results at the national, state, and local levels, but sometimes you just want to get the job done with a five-digit ZIP code.
The Problem with ZIP Codes
ZIP codes are useful because they’re widely adopted, commonly understood, and don’t require personal information like street addresses. Technically, a ZIP code is a set of postal routes and not a defined geographic region, but the Census Bureau provides a ZCTA (ZIP Code Tabulation Area) dataset that can serve as a proxy for ZIP code boundaries.
However, ZIP code boundaries don’t always line up with the boundaries of legislative districts. A single ZIP code can overlap multiple districts, which makes any “ZIP to District” service inherently inexact: when a ZIP code spans several districts, the ZIP code alone cannot tell you which district represents a given resident.
Figure: The 30252, 60629, and 78666 ZIP codes each overlap with four different congressional districts.
We decided to calculate some statistics on the accuracy of ZIP to district matching to see whether we could reduce the error rate with a different approach. Pulling from our district boundaries database and US Census Bureau population data, we were able to better quantify the issue.
Figure: The US Census Bureau provides detailed geographic data about population distribution. On the map above, darker areas represent more people and state boundaries are shown in blue.
ZIP Centroid to District
One approach to matching a ZIP code with a legislative district is to determine which district contains the centroid (the geometric center) of the ZIP code’s ZCTA, then assume every other resident of the ZIP code is in the same district. We’ll call this the “ZIP Centroid to District” method.
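The centroid method can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Cicero's implementation: the shapes are made-up rectangles in planar coordinates, the names `polygon_centroid`, `point_in_polygon`, and `centroid_to_district` are hypothetical, and real ZCTA and district boundaries would normally be handled with a GIS library.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # vertices in order, first vertex not repeated


def polygon_centroid(poly: Polygon) -> Point:
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for i in range(len(poly)):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % len(poly)]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def point_in_polygon(pt: Point, poly: Polygon) -> bool:
    """Ray-casting test: count edge crossings to the right of the point."""
    x, y = pt
    inside = False
    for i in range(len(poly)):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % len(poly)]
        if (y0 > y) != (y1 > y):  # edge straddles the horizontal ray
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def centroid_to_district(zcta: Polygon,
                         districts: Dict[str, Polygon]) -> Optional[str]:
    """Return the district containing the ZCTA centroid, if any."""
    center = polygon_centroid(zcta)
    for name, boundary in districts.items():
        if point_in_polygon(center, boundary):
            return name
    return None


# Made-up shapes: the ZCTA straddles the line x = 3 between two districts.
zcta = [(0, 0), (4, 0), (4, 2), (0, 2)]
districts = {
    "District A": [(-1, -1), (3, -1), (3, 3), (-1, 3)],
    "District B": [(3, -1), (6, -1), (6, 3), (3, 3)],
}
match = centroid_to_district(zcta, districts)
print(match)
```

In this toy case the centroid lands in District A, so every resident of the ZIP code is assigned to District A, including those living in the strip that actually belongs to District B. That strip is exactly the kind of mismatch the accuracy figures below measure.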
Across the United States, our analysis found a 92% accuracy rate for matching constituents to congressional districts with the “ZIP Centroid to District” method. Results vary by region, with some regions performing better than others. Still, approximately 22.5 million people may be improperly matched to their representative in Congress if only ZIP codes are used.
Figure: Regions shown in red where the “ZIP Centroid to District” method returns an inaccurate match for congressional districts.
The problem affects all levels of representation. Matching is even more challenging with smaller districts in state upper and lower legislative chambers. For these state chambers, ZIP codes are less likely to be contained entirely within a single district because there are many small districts (e.g. New Hampshire has 204 legislative districts for its House of Representatives).
At the state level, our analysis found an 85% accuracy rate for upper legislative chamber districts and a 75% accuracy rate for lower legislative chamber districts. Inaccuracies are especially concentrated in urban areas.
Figure: Regions shown in red where the “ZIP Centroid to District” method returns an inaccurate match for state upper legislative chamber districts.
Figure: Regions shown in red where the “ZIP Centroid to District” method returns an inaccurate match for state lower legislative chamber districts.
Best Guess with Population Analysis
If we don’t know which of several districts represents a ZIP code, we could still determine which district represents the most people in the ZIP code. Using this approach, we can assign a district to each ZIP code based on the district most likely to represent a resident.
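As a rough sketch of this idea, suppose we already know how many residents of each ZIP code live in each overlapping district. The ZIP codes, district names, and population counts below are invented for illustration, and `best_guess` is a hypothetical helper name.

```python
from typing import Dict

# Hypothetical data: ZIP code -> {district: residents living in the overlap}
overlap_population: Dict[str, Dict[str, int]] = {
    "00001": {"District A": 8000, "District B": 2000},
    "00002": {"District B": 5000},
    "00003": {"District A": 2000, "District B": 3000},
}


def best_guess(pop_by_district: Dict[str, int]) -> str:
    """Pick the district that holds the most residents of the ZIP code."""
    return max(pop_by_district, key=pop_by_district.get)


assignment = {
    zip_code: best_guess(pops)
    for zip_code, pops in overlap_population.items()
}
print(assignment)
```

Unlike the centroid method, this choice depends on where people live rather than on the shape of the ZCTA, so a large but sparsely populated part of a ZIP code does not dominate the result.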
With a “ZIP to Best Guess Based on Population” method, the accuracy rate improves to 94% for Congress, 89% for state upper legislative chambers, and 81% for state lower legislative chambers. For some purposes, these accuracy rates might be high enough to justify using just ZIP codes instead of full address matching.
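To make the comparison concrete, here is one plausible way to compute an accuracy rate, assuming it means the share of residents whose actual district equals the district assigned to their ZIP code. That definition is our assumption for this sketch, and the population counts and both assignments are invented.

```python
from typing import Dict

# Hypothetical data: ZIP code -> {district: residents living in the overlap}
overlap_population: Dict[str, Dict[str, int]] = {
    "00001": {"District A": 8000, "District B": 2000},
    "00002": {"District B": 5000},
    "00003": {"District A": 2000, "District B": 3000},
}


def accuracy(populations: Dict[str, Dict[str, int]],
             assignment: Dict[str, str]) -> float:
    """Share of all residents who live in their ZIP code's assigned district."""
    total = sum(sum(pops.values()) for pops in populations.values())
    correct = sum(
        pops.get(assignment[zip_code], 0)
        for zip_code, pops in populations.items()
    )
    return correct / total


# Suppose the centroid of ZIP 00003 happens to fall in District A,
# even though more of its residents live in District B.
centroid_assignment = {"00001": "District A", "00002": "District B",
                       "00003": "District A"}
best_guess_assignment = {"00001": "District A", "00002": "District B",
                         "00003": "District B"}

centroid_accuracy = accuracy(overlap_population, centroid_assignment)
best_guess_accuracy = accuracy(overlap_population, best_guess_assignment)
print(centroid_accuracy, best_guess_accuracy)
```

Under this definition the best guess can never score lower than the centroid method on the same data, because it picks the most populous district in every ZIP code.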
Through population analysis, we can also determine the probability that a resident of a ZIP code lives in each of the possible districts. This makes it possible to set a threshold below which a ZIP code to district match is considered too uncertain and the lookup falls back on an address-level matching service.
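A threshold like this might look as follows. The 0.9 cutoff, the population counts, and the function names are all assumptions made for illustration; a return value of `None` stands in for "fall back on address-level matching".

```python
from typing import Dict, Optional


def district_probabilities(pop_by_district: Dict[str, int]) -> Dict[str, float]:
    """Probability that a random resident of the ZIP lives in each district."""
    total = sum(pop_by_district.values())
    if total == 0:
        return {}
    return {d: n / total for d, n in pop_by_district.items()}


def match_or_fallback(pop_by_district: Dict[str, int],
                      threshold: float = 0.9) -> Optional[str]:
    """Return the most likely district, or None if the match is too uncertain."""
    probs = district_probabilities(pop_by_district)
    if not probs:
        return None
    district, p = max(probs.items(), key=lambda item: item[1])
    return district if p >= threshold else None


# Hypothetical ZIP codes: one is nearly all in one district, one is split.
confident = match_or_fallback({"District A": 9500, "District B": 500})
uncertain = match_or_fallback({"District A": 6000, "District B": 4000})
print(confident, uncertain)
```

Raising the threshold sends more ZIP codes to the address-level fallback in exchange for fewer wrong matches, so the right value depends on how costly a mismatch is for a given use.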
Start Matching Constituents to their Representatives
To learn more about ZIP code matching and our analysis, reach out to the Cicero Team. If you’ve dealt with ZIP to District inaccuracies in the past, we’re interested to hear how you handled them.
To match constituents with their representatives at the highest possible accuracy, consider using the Cicero API. Have a spreadsheet of addresses that immediately need to be matched to their elected representatives? Check out Cicero’s District Match.