Every forecaster’s prayer. Photo by Mark Duffel on Unsplash

How Accurate Are Census Population Estimates?

Fairly Good, But Far From Perfect

Lyman Stone
In a State of Migration
Dec 15, 2017

Next week, we will get new Census estimates of state and national population. The headlines will focus on estimated population change from July 1, 2016 to July 1, 2017. That’s an interesting story.

But these are just estimates. It’s not actually as if they went out and knocked on doors or something. That only happens in Decennial Census years. So these estimates are a best guess, but do nonetheless have error rates.

As a result, there are often revisions to back-year estimates. These revisions can be very large, especially in years where Census changes their estimation methodology in a meaningful way, as they did last year. As such, while the main press will focus on the year-over-year estimated change, I make a point of analyzing the revisions to back-year data.

Those revisions can be meaningful. Here’s the average state-level population revision size, for each estimate-year and vintage-year, expressed as the average absolute value change in estimated population:

Apologies this chart isn’t interactive today. Having technical difficulties with Datawrapper. Get interactive version here.

As you can see, the final intercensal revisions were way bigger than any of the annual revision series.
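For anyone who wants to reproduce that kind of number, here is a minimal sketch of the calculation. The populations are made up, the state codes "AA" and "BB" are placeholders, and the nested-dictionary layout and the function name `mean_abs_revision` are my own assumptions for illustration, not anything Census publishes. It measures revisions in persons; a percentage version would divide each difference by the older estimate.

```python
from collections import defaultdict

# Hypothetical numbers, NOT real Census data.
# Layout (an assumption): estimates[vintage][state][estimate_year] = population
estimates = {
    2015: {"AA": {2014: 1000, 2015: 1010}, "BB": {2014: 2000, 2015: 1990}},
    2016: {"AA": {2014: 1004, 2015: 1016, 2016: 1025},
           "BB": {2014: 1994, 2015: 1981, 2016: 1970}},
}

def mean_abs_revision(estimates, old_vintage, new_vintage):
    """Average absolute change across states, for each year both vintages cover."""
    diffs = defaultdict(list)
    old, new = estimates[old_vintage], estimates[new_vintage]
    for state in old.keys() & new.keys():
        for year in old[state].keys() & new[state].keys():
            diffs[year].append(abs(new[state][year] - old[state][year]))
    return {year: sum(v) / len(v) for year, v in sorted(diffs.items())}

revisions = mean_abs_revision(estimates, 2015, 2016)
print(revisions)  # {2014: 5.0, 2015: 7.5}
```

Only years that appear in both vintages are compared, so the newest year of the newer vintage (2016 here) has no revision yet.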

But what may be a bit less obvious is that the size of revisions has declined in the current decade. Here’s the revision size for each year-vintage, but instead of calendar year, it’s years-behind-vintage. “0” in V2016 means the V2016 estimate of 2016, while 0 in V2009 means the V2009 estimate of 2009. 3 in V2015 means the V2015 estimate of 2012, while 3 in V2008 means the V2008 estimate for 2005. Basically, it’s a way of lining up years for which vintages had similar amounts of available information, to see if Census’ information-controlled-estimate-volatility has changed over time. I’ll ignore the intercensal numbers because they are so much bigger and have a form of information (Census results) not available to the other vintages.
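The re-indexing is just vintage year minus estimate year. A small sketch, with invented revision sizes and a hypothetical helper name (`by_years_behind`), in case the alignment is hard to picture:

```python
# Hypothetical revision sizes, NOT real Census data.
# Layout (an assumption): {vintage: {estimate_year: revision_size}}
revisions_by_vintage = {
    2008: {2005: 0.30, 2006: 0.25, 2007: 0.20, 2008: 0.10},
    2015: {2012: 0.15, 2013: 0.12, 2014: 0.08, 2015: 0.05},
}

def by_years_behind(revisions_by_vintage):
    """Re-key each vintage by (vintage - estimate_year) instead of calendar year."""
    return {
        vintage: {vintage - year: value
                  for year, value in sorted(years.items(), reverse=True)}
        for vintage, years in revisions_by_vintage.items()
    }

aligned = by_years_behind(revisions_by_vintage)
print(aligned[2015][3])  # 0.15, the V2015 value for 2012
print(aligned[2008][3])  # 0.3, the V2008 value for 2005
```

Once both vintages share the same 0, 1, 2, 3 axis, they can be compared position by position even though they describe different calendar years.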

As you can see, Census’ revision size is a lot lower than it was. Comparing V2014 to V2004 yields smaller revisions at every years-behind position. The same is true for V2015 vs. V2005, except for year 4, where V2015 made a historic revision in a few states, and comparing V2016 to V2006 really shows a big gap.

Some of this is Hurricane Katrina. But then again, the late 2010s are going to get hurricane volatility in estimates due to Hurricanes Irma, Harvey, and Maria as well!

But on the whole, it seems like Census has reduced the average size of changes. This may be due to method changes. The American Community Survey has improved a lot since the mid 2000s, most components of change data are available earlier than they used to be, and a deeper field of commentary and data work outside of Census may be helping Census improve their estimates.

Or, maybe that’s all wrong, and Census is artificially under-estimating population volatility. One way we can estimate that is to look at the standard deviation in population growth rates for each year shown in each vintage, and compare them to the final intercensal estimate: did Census over- or under-estimate population volatility in the last decade?
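A rough sketch of that comparison for a single year, using two invented states. The data layout and the function name `growth_rate_spread` are assumptions for illustration, and I use the population standard deviation (`statistics.pstdev`) here; the sample version (`statistics.stdev`) would be an equally defensible choice.

```python
from statistics import pstdev

def growth_rate_spread(pop_by_state, year):
    """Standard deviation, across states, of growth from year-1 to year."""
    rates = [series[year] / series[year - 1] - 1
             for series in pop_by_state.values()]
    return pstdev(rates)

# Hypothetical populations, NOT real Census data.
vintage = {"AA": {2005: 1000, 2006: 1020}, "BB": {2005: 2000, 2006: 2000}}
intercensal = {"AA": {2005: 1000, 2006: 1030}, "BB": {2005: 2000, 2006: 1980}}

spread_vintage = growth_rate_spread(vintage, 2006)
spread_final = growth_rate_spread(intercensal, 2006)

# A ratio below 1 means the vintage understated how uneven growth was;
# a ratio above 1 means it overstated it.
ratio = spread_vintage / spread_final
print(round(ratio, 3))
```

In this toy case the vintage shows half the spread of the final numbers, which is what an understated shock would look like.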

It looks as if most estimates did a pretty good job of estimating the amount of variation in growth rates, though on the whole there seems to be a bias towards overstating growth variance in the estimates, except in 2006, where estimates understated the shock of Hurricane Katrina.

So it doesn’t seem wildly likely that Census is systematically under-estimating the degree of variation in state growth rates. On the whole, Census may allocate population change in the wrong states sometimes, but they basically tend to be right about major changes in the degree of inequality in population growth across states. If they say growth is becoming more uneven, then it’s probably becoming more uneven.

Okay, so, the question then has to be, “How well do population estimates predict Census population?”

To do this, we will annualize growth rates to get decadal growth. For each vintage, we will take their estimates up to their last year, then take the annualized growth rate for the decade to that point and extrapolate it forward. Comparing to the 2010 number, we’ll see how large each vintage’s error is for each state.
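Here is a minimal sketch of that extrapolation for one state and one vintage. It assumes compound annual growth from a 2000 base; a straight-line version would give slightly different numbers. The populations and the function name `extrapolate` are invented for illustration.

```python
def extrapolate(base_year, base_pop, last_year, last_pop, target_year):
    """Carry the annualized growth rate seen so far forward to target_year."""
    years_elapsed = last_year - base_year
    annual_growth = (last_pop / base_pop) ** (1 / years_elapsed) - 1
    return last_pop * (1 + annual_growth) ** (target_year - last_year)

# Hypothetical state: 1,000 people in 2000, 1,100 in a vintage ending in 2005,
# and an actual 2010 count of 1,250. NOT real Census data.
forecast = extrapolate(2000, 1000.0, 2005, 1100.0, 2010)
actual_2010 = 1250.0
error = forecast / actual_2010 - 1

print(round(forecast, 1))   # about 1210
print(round(error * 100, 1))  # percent error, negative = forecast too low
```

Ten percent growth in the first five years implies another ten percent in the second five, so the forecast lands around 1,210 and undershoots the made-up 2010 count by a bit over 3%.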

As you can see, different year-vintages can have a big impact on your expectation of 2010 population. Your expectation of 2010 population for Wyoming would have shifted by about 7% over the course of the forecast period, rising each time, and still wasn’t high enough! A similar story is true for many states: Census was continually pushing up estimates for some states, and those states kinda sorta tended to also get similar-direction revisions in the last year. Broadly speaking, the more Census had to scramble to chase population upwards through continual upward revisions of the implied 2010 forecast, the bigger the absolute value of both the total decadal revision and the last-year revision. In other words, the bigger the change in during-period forecasts, the bigger the corresponding change in net revisions as well.

Perhaps an example will help. Let’s look at Wyoming. The chart below shows the various estimate-and-linear-forecast projections of Wyoming state population.

As you can see, each subsequent Wyoming population estimate implied higher and higher 2010 population… and the final intercensal tally came in higher still, including in back-years!

We can see this on the negative side too. Let’s look at Michigan.

Michigan’s population forecast went down, down, down, down, and down, and the final came in… even more down.

Now, there are exceptions. Some states get revisions down then up, some up then down, some just sort of churn… but the point is that, to the extent revisions were biased from 2000–2010, they tended to understate the extremity of change, particularly in back-years.

What does this tell us about current population estimates? Well, it suggests that the pattern of past revision is at least slightly indicative of future revision. Revisions have been much smaller this time around, which suggests future revisions may also be smaller, as data quality has improved markedly.

So…. is this useful at all?

I get a lot of questions from people asking me, “Are the Census population estimates worth paying attention to?” The answer is YES. Each progressive Census estimate is more predictive of 2020 Census population than the last one, and any are more predictive than a simple extrapolation from previous decadal growth rates. These numbers are informative; they have real content and meaning. People should use them.

But they are not the last word. I mean literally, Census goes back and changes them every year. And when we see revisions, we should pay attention: revisions tend to grow with time, and they especially tend to be back-cast once we get a final Census number.

So one straightforward way to conceptualize the Census estimates is just to make annualized growth rate extrapolations forward to 2020 for the post-2010 vintages we have thus far, and see what trend they have. Those trends tend to be exacerbated in the final intercensal numbers. I will represent this as an index: using 2010–2011 growth rates from V2011 extrapolated out to 2020, I show each new vintage 2020 forecast as a percentage of that first forecast, to show whether we should be revising our estimates of a state’s 2020 number up or down.
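A sketch of how such an index could be built for one state. The numbers are invented, each vintage is assumed to supply its own 2010 and latest-year estimate, growth is assumed to compound annually, and the names `forecast_2020` and `vintages` are hypothetical.

```python
def forecast_2020(pop_2010, last_year, last_pop):
    """Extrapolate the annualized growth since 2010 out to 2020."""
    annual_factor = (last_pop / pop_2010) ** (1 / (last_year - 2010))
    return last_pop * annual_factor ** (2020 - last_year)

# Hypothetical state, NOT real Census data.
# vintage -> (its 2010 estimate, its latest-year estimate)
vintages = {
    2011: (1000.0, 1010.0),
    2012: (1000.0, 1030.0),
    2013: (1000.0, 1015.0),
}

forecasts = {v: forecast_2020(p2010, v, latest)
             for v, (p2010, latest) in vintages.items()}

# Index each vintage's 2020 forecast to the V2011 forecast (= 100).
index = {v: 100 * f / forecasts[2011] for v, f in forecasts.items()}

for v in sorted(index):
    print(v, round(index[v], 1))
```

An index above 100 means the newer vintage implies a bigger 2020 population than V2011 did; below 100 means the outlook has been revised down.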

This gets to the real heart of the matter. Some states have consistent downward revisions in the 2020 population forecast. Alaska, for example, or Alabama. Or Illinois. Or Kansas. Or Louisiana. Or Mississippi. Or New Mexico. Or Pennsylvania. Or Vermont. Or West Virginia.

Or, the big one, at the very bottom… Puerto Rico. And that is where these questions have the biggest oomph. People want to know if these estimates are reliable for Puerto Rico. I’ll go on record right here and say I expect that, either in this revision for 2017, or at least by the 2020 Census, we will see a substantial cut in Puerto Rico’s forecast 2020 population, and probably also their back-year populations. But hey, we’ll see. I can always be wrong, and will admit so publicly if I am.

Now on the other hand, we have constant-gainers, like Utah, South Carolina, Nevada, Michigan, Idaho, Florida, or Arizona. That doesn’t mean those places are fast-growing, it just means Census keeps raising its view of how fast they are growing (or how slowly they’re shrinking).

What I want you to notice is that regional stories are weak here. Numerous warm states have rapidly-worsening 2020 outlooks, and some rather chilly places are seeing improved forecasts. This is all about state-specific stuff.

I have no rousing end to this post. If you’re not already roused to excitement by this riveting #content that I’ve shared, then this post wasn’t aimed at you anyways. But if you made it this far, then, well, I’ve said what I came to say and you’ve got my basic point: keep an eye on revisions!

I will be on vacation next week. I hope to cover the new estimates when they come out, but I may be slightly delayed. I request that the internet delay all discussion of the data until I have broken away from my Festivus grievance-airing long enough to write a post. ;-)

I’m a native of Wilmore, Kentucky, a graduate of Transylvania University, and also the George Washington University’s Elliott School. My real job is as an economist at USDA’s Foreign Agricultural Service, where I analyze and forecast cotton market conditions. I’m married to a kickass Kentucky woman named Ruth.

DISCLAIMER: My posts are not endorsed by and do not in any way represent the opinions of the United States government or any branch, department, agency, or division of it. My writing represents exclusively my own opinions. I did not receive any financial support or remuneration from any party for this research.

Global cotton economist. Migration blogger. Proud Kentuckian. Advisor at Demographic Intelligence. Senior Contributor at The Federalist.