How State Censuses Change History

Some Preliminary Estimates from State Census Records

Lyman Stone
In a State of Migration
10 min read · Jun 6, 2017

My Twitter followers know that I’ve been spending spare moments trying to put together all the state population estimates from the 19th century derived from state censuses or tax assessment surveys. The project has proceeded at a slow-but-steady pace and, while not complete, I’m excited to share some early results. For today’s post, I’ll show the results for 6 areas: Iowa, Kansas, Illinois, the District of Columbia, New York, and Wisconsin.

First of all, what’s my data coverage for these states? Here’s a table of each state marking years for which I have estimates that I adopt:

Get the full table here. For some reason Datawrapper isn’t embedding tables well.

As you can see, some states were very aggressive about tracking their population. Iowa constitutionally set a rule of biennial Censuses for itself, although administrative difficulties meant that these Censuses didn’t actually fall biennially, and some extra Censuses were conducted. Kansas undertook decennial Censuses at the Federal midpoint, but also did annual tax assessment surveys that counted the population, reporting many of these assessment-based estimates in the annual reports of the state board of agriculture. The District of Columbia conducted some censuses, but sadly I haven’t been able to locate most of them; just one in 1876. However, DC did produce fairly thorough “police surveys” on an annual basis, and that data is available in some years. Illinois and New York both conducted decennial censuses at the Federal midpoint until 1865, when Illinois stopped, and 1875, when New York stopped. And finally Wisconsin conducted a series of territorial censuses, then state censuses, again, at the Federal midpoint, until at least 1895. Because I use official Census Bureau estimates after 1895, I didn’t look for 20th century records. But in the handful of cases I did run across, it seems like the Census Bureau at least sometimes adopted state Censuses as valid estimators of population.

So with that, let’s get to seeing how incorporating these datapoints changes population figures.

District of Columbia

I’ll start each area by showing my preferred population series for that area, from my absolute earliest population estimate to the present day.

Okay, so… cool, I guess. You can see DC’s sharp rise until the 1950s, and then its sharp decline since… and then the equally sharp rise again in recent years. Sidenote: DC is one of the few places where I think the “urban renaissance” story is super-duper real and accurate, and graphs like this are why. We’re talking about a really impressive turnaround in the central area of a large MSA.

But this doesn’t really show you what impact including non-Federal sources had on the estimate. So the next graph I’ll show for each area will compare an estimate based purely on Federal imputation to one that uses multiple sources, just for 1800–1900.

From 1800 to 1870, the two lines are identical, because I have no non-Federal data before 1870. But after 1870, some divergences begin to appear. Small ones at first, but then bigger; by 1887, there’s a substantial difference in the estimate of about 5,000 residents. In 1892, it’s 19,000 residents, with the police survey coming in much higher. The last police survey in 1897 gives an estimate of 277,782 inhabitants, while the 1900 Census gives 278,718. The DC report on this data raises a question of accuracy of the 1900 Census: locals knew of no reason growth should have fallen off, and they felt confident that their police enumerators did their job correctly. Their suggestion is that the police survey better counted marginal people and the non-household population, as the police better knew where to find such people. The next decade from 1900–1910 does show a quickening pace of growth for DC, which might be the result of 1900 being an undercount. However, my preference is simply to accept the numbers as they are and impute between them, rather than get embroiled in figuring out which one was correct.
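To make "impute between them" concrete, here is a minimal sketch of one way to fill the years between two counts. The post does not say which interpolation is used, so the constant-growth-rate (geometric) rule below is an assumption, and the function name `geometric_interpolate` is hypothetical. The only data used are the two DC figures quoted above.

```python
def geometric_interpolate(known, year):
    """Estimate population in `year` from a dict {year: population},
    assuming a constant annual growth rate between adjacent known counts.
    (Assumed method; the post does not specify its interpolation rule.)"""
    if year in known:
        return float(known[year])
    years = sorted(known)
    if not years[0] < year < years[-1]:
        raise ValueError("year is outside the range of known counts")
    # Find the known years that bracket the target year.
    lo = max(y for y in years if y < year)
    hi = min(y for y in years if y > year)
    # Share of the interval that has elapsed by the target year.
    frac = (year - lo) / (hi - lo)
    return known[lo] * (known[hi] / known[lo]) ** frac


# DC figures quoted in the text: 1897 police survey and 1900 Federal census.
dc_known = {1897: 277_782, 1900: 278_718}
dc_1898 = geometric_interpolate(dc_known, 1898)
dc_1899 = geometric_interpolate(dc_known, 1899)
```

Adding a state census or police survey to `known` simply adds another anchor point, which is why the combined and Federal-only series only diverge in years where non-Federal data exists.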

Illinois

The above chart shows my preferred time series of Illinois’ population, exclusive of Native Americans in the early period. You can see the rapid growth in the 19th century and early 20th, the deployment effect of WWII, and then weaker growth performance since the 1960s.

Here you can see the effect of including Illinois state censuses. Broadly speaking, it tends to shift the growth of population a bit earlier in each decade except the 1820s. One feature that’s particularly interesting is that including the state census boosts Illinois’ population in 1865 substantially, which surprised me for reasons I’ll explain when we get to New York. Likewise, we can see the extremely rapid growth in the early 1850s thanks to canal completion, high immigration, and rapidly expanding railroad linkages.

Here’s another way to see the differences: just a graph of the raw difference (positive means the combined series is higher).

I should note, there’s some error in the Illinois census data. Finding Illinois census records was challenging; apparently even the Illinois state archives do not possess the original published reports. I had to get these estimates second-hand from newspapers and books and, for some years, by backing into them from descriptions of changes between censuses. Different sources often gave slightly different estimates. I have tried to pick the estimates as cited in the most credible sources or the latest citations (reflecting revisions), but even then there’s some error. The data for 1865 in particular is quite poor, with an error band probably 100,000–200,000 people wide, suggesting the civil-war-era growth rate could be much lower (though not negative).

New York

The above chart shows the familiar path of New York state’s population: quick growth until the 1940s, though not as quick as Illinois’, then a sharp drop for WWII. Then a resumption of growth, and then a pretty remarkable dropoff in the 1970s. Growth has resumed since at a lower rate. Notably, this graph actually does visibly show the impact of including New York state census data: that hitch from 1860–1865 is the result of the Census of the State of New York in 1865.

As you can see, state census data indicates lower population during the 1810s, the 1840s, and the 1860s, but higher in the 1890s. This especially makes sense in the 1840s, as immigration into New York accelerated dramatically in the later part of the decade. In the 1860s, we are evidently observing the effects of the civil war: reduced immigration, deployment of troops, wartime urban unrest, improved infrastructure allowing outflows, and especially the effect of geographically separated recruitment of immigrants on suppressing fertility and accelerating their dispersion beyond New York. Then in the 1890s, the state census gives a higher estimate.

Most states for which 1865 data exists did not experience population declines. It is possible that New York’s census reflects results from earlier in the year. It is also possible New York’s unique immigrant-dependence for growth made it uniquely susceptible to wartime declines as immigrants disproportionately joined the union army and were then relocated. Detailed records from this census exist and could be used to explore this question more thoroughly.

It should be noted, the census takers of 1845 complained about a worsening ability to recruit marshals due to an administrative shift in responsibility from local governments to the state government. I do not find the measured population to be at all implausible, but I just want to be clear here about where the records I read spoke to possible measurement error.

Iowa

Iowa’s population time series is fairly typical of the plains states. After settlement there is extremely rapid growth, until the 20th century when growth is rapidly curtailed and a slow, volatile growth rate sets in, punctuated by significant periods of decline like WWII or the 1980s.

The above chart is a bit different from the others. It shows Federal, combined, and only state series. Iowa, as I noted above, required a very large number of censuses. I am not entirely confident they are all correct or complete. If you include federal censuses in 1860, 1870, and 1880, Iowa’s population has to jump sharply in the periods leading up to them, whereas dropping the Federal censuses yields a smoother series. I was not able to find clear reasons why growth would be so much faster from 1869–1870 than from 1870–1873, for example.

Ultimately, I am concerned enough about consistent under-reporting that I’ve adopted a different combination method. Broadly speaking, I impute the rate of change between the Federal decennial censuses, which I treat as valid, based on the relative change shown by the state-only time series. In other words, I use the state data as fairly reliable estimators of within-decade relative population changes, but force the series to true up to the decennial data. The result is:
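Here is a rough sketch of one way to read that "true up" step. It is not necessarily the exact procedure used for the chart: it assumes the state-only series has already been filled in for every year of the decade, and it closes the gap to the Federal counts by geometrically interpolating a correction ratio across the decade. The function name `true_up` and all the numbers are hypothetical, for illustration only.

```python
def true_up(state_series, fed_start, fed_end):
    """Rescale a state-only annual series so its endpoints match Federal counts.

    state_series: list of (year, population) pairs covering every year from one
        Federal census year to the next, inclusive.
    fed_start, fed_end: Federal census counts for the first and last year.
    """
    first_year, last_year = state_series[0][0], state_series[-1][0]
    # Ratio of the Federal count to the state figure at each end of the decade.
    ratio_start = fed_start / state_series[0][1]
    ratio_end = fed_end / state_series[-1][1]
    adjusted = []
    for year, pop in state_series:
        frac = (year - first_year) / (last_year - first_year)
        # Move the correction ratio smoothly from one endpoint to the other.
        ratio = ratio_start * (ratio_end / ratio_start) ** frac
        adjusted.append((year, pop * ratio))
    return adjusted


# Hypothetical figures in thousands (NOT Iowa's actual counts).
state_only = [
    (1860, 650.0), (1861, 660.0), (1862, 665.0), (1863, 675.0),
    (1864, 700.0), (1865, 740.0), (1866, 800.0), (1867, 870.0),
    (1868, 950.0), (1869, 1030.0), (1870, 1100.0),
]
adjusted = true_up(state_only, fed_start=700.0, fed_end=1200.0)
```

The adjusted series hits the Federal figures in both census years, while the within-decade shape (slow early years, faster later years in this made-up example) still comes from the state data.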

Slower growth during the civil war seems plausible, as does a speed-up afterwards, though not one as severe as the combined data shows. Unlike the pure state data, this series is still trued up to the decennial 1870 population. There’s still some method-dependent “catch-up” growth to the Federal decennial, but I’m comfortable enough with this series.

Wisconsin

Wisconsin’s growth is kind of entertainingly linear. Literally just draw a straight line and, except for WWII, you basically have their growth pattern. So state censuses shouldn’t provoke enormous changes.

But it turns out, Wisconsin has the opposite problem Iowa has: persistent post-civil-war overestimation, whereas Iowa underestimated. But Wisconsin is conducting decennial full censuses, not biennial like Iowa, and before 1848 the censuses are territorial censuses which are conducted under Federal supervision. So my bias is to accept at least the pre-1848 censuses uncritically. Probably overestimations begin in 1855. Geometric extrapolation shows the Federal census of 1860 higher than you’d expect, but that’s because Wisconsin’s 1865 census was, well, in 1865. This extrapolation assumes the 1860–1865 growth rate was the same as the 1855–1860 growth rate, which doesn’t seem likely.

But it turns out that gap is pretty consistent: about 4%, except for 1865. So if we just assume Wisconsin overcounts by 4% but we want to preserve the mid-decade estimators, we can just lower the state estimates by a flat 4% in each year. We get much closer to the decennial figures, with the exception of the 1855–1865 period, which, again, is a fluke of differential wartime growth rates. Using this 4% reduced series instead to compute our combined series, the result is:
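As a small sketch of the flat-discount idea: measure the gap between what the state censuses imply for a Federal census year and the Federal count itself, then lower every state count by the same share. The helper names and every number below are hypothetical (they are not Wisconsin's actual figures), and the geometric interpolation between state censuses is an assumption.

```python
def implied_at(year, y0, p0, y1, p1):
    """Geometrically interpolate between two state censuses (y0, p0) and (y1, p1)."""
    frac = (year - y0) / (y1 - y0)
    return p0 * (p1 / p0) ** frac


def overcount_share(state_implied, federal):
    """Gap between the state-implied and Federal figures, as a share of the Federal figure."""
    return (state_implied - federal) / federal


def discount(state_counts, share=0.04):
    """Lower every state census count by a flat share (4% in the text)."""
    return {year: pop * (1 - share) for year, pop in state_counts.items()}


# Hypothetical counts in thousands, for illustration only.
state = {1875: 1000.0, 1885: 1300.0}
federal_1880 = 1100.0

gap_before = overcount_share(
    implied_at(1880, 1875, state[1875], 1885, state[1885]), federal_1880
)
adj = discount(state)
gap_after = overcount_share(
    implied_at(1880, 1875, adj[1875], 1885, adj[1885]), federal_1880
)
```

Because the discount is the same in every year, it shifts the level of the state series without changing its mid-decade growth pattern, which is the part worth preserving.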

Muuuuuch better.

Kansas

The case of Kansas is a curious one. We have both tax assessor data and census data. This data shows a steep rise upon settlement, then long-run linear growth since the late 19th century.

But hold on. There’s an enormous depopulation of Kansas in the late 1880s and early 1890s! What the heck is going on here!

Well, that’s what the assessor and census data says. It approximately matches up around Federal decennial years, it has no consistent bias, and the state reports presenting the data are able to show evidence about where and why the depopulation happened. They document crop failures, horrible winters, ghost towns, etc. The era also saw a sudden rash of “county wars” where towns actually engaged in violence trying to decide which would be the county seat. Kansas county wars claimed the lives of dozens of people, and reflected the large economic impact of having the county seat. The corollary here is that Kansas in the 1890s didn’t have a lot of economic activity in small towns, suggesting poor economic conditions consistent with a rising push and declining pull factors. It is in the violence of 1870s and 1880s Kansas that Wyatt Earp first makes his name, with a fairly short interim in Tombstone.

As you can see, state data does undercount some, but it’s not consistent by any means. Some of the undercounts don’t reflect errors, but growth timing: early growth is always very fast, so we’d expect the extrapolation between 1855 and 1865 to understate 1860 population. The growth rates from 1860 to 1865, however, are entirely plausible, as are those from 1865 to 1870, which we would naturally expect to be faster than the previous 5 years. Sluggish growth in the early 1870s is a bit odd, but Kansas reports give reasons for this slow growth. Accelerated growth in the late 1870s is perhaps odder, but some rapid growth was a near certainty. Meanwhile, the trends in the 1880s are fairly thoroughly documented and tracked.

As such, I’m comfortable using all of the Kansas and Federal data together and interpolating between them, yielding the time series shown in the first graph.

State Censuses Matter

The net effect of including state censuses and other state-level data in population estimates can be fairly large. The graph below takes the absolute value of each state-level change for the 6 states in question for each year, sums it, and divides by the sum of their population based on simple Federal census extrapolation.

As you can see, the effects can be quite large.
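The absolute-change measure described above can be sketched roughly as follows. The data layout (a dict of state to year to population), the function name `absolute_revision_share`, and the two-state example figures are all hypothetical; only the formula follows the description in the text.

```python
def absolute_revision_share(federal_only, combined):
    """For each year, sum |combined - federal_only| across states and divide
    by the summed Federal-only population for the same states.

    Both arguments map state -> {year: population} with matching keys.
    """
    years = sorted(next(iter(federal_only.values())))
    shares = {}
    for year in years:
        total_abs_change = sum(
            abs(combined[state][year] - federal_only[state][year])
            for state in federal_only
        )
        total_population = sum(federal_only[state][year] for state in federal_only)
        shares[year] = total_abs_change / total_population
    return shares


# Hypothetical two-state example, in thousands (not real census figures).
federal_only = {
    "A": {1865: 1000.0, 1866: 1100.0},
    "B": {1865: 500.0, 1866: 520.0},
}
combined = {
    "A": {1865: 1050.0, 1866: 1100.0},
    "B": {1865: 470.0, 1866: 533.0},
}
shares = absolute_revision_share(federal_only, combined)
```

Taking absolute values keeps an upward revision in one state from cancelling a downward revision in another; dropping `abs()` would give a net measure instead.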

And when I add up U.S. population from all of the sources I have available for all 50 states and DC and compare it to imputations that do not include that data (i.e. are mostly just Federal censuses), here’s the net percentage change in national population in each year:

This isn’t just states reliably over-estimating; rather much of this is just that state censuses catch the rapid growth that accompanies settlement of new land, as well as major within-decade differences in growth.

The point is, including this data can really change some of our perceptions about when and where growth happened. That’s why I want to find more of these state censuses. The major gap is southern states. I am still missing all of the census totals from Alabama, Arkansas, Colorado, DC before 1870, Florida before 1885, Georgia, Louisiana, Maine, Missouri, Nebraska, New Jersey for 1855 and 1865, South Carolina for 1869, and Michigan for 1845 and 1888. If you happen to know where I can get the total resident population figures from any of these censuses, I will give you Migration Points in exchange!

Check out my Podcast about the history of American migration.

If you like this post and want to see more research like it, I’d love for you to share it on Twitter or Facebook. Or, just as valuable for me, you can click the recommend button at the bottom of the page. Thanks!

Follow me on Twitter to keep up with what I’m writing and reading. Follow my Medium Collection at In a State of Migration if you want updates when I write new posts. And if you’re writing about migration too, feel free to submit a post to the collection!

I’m a native of Wilmore, Kentucky, a graduate of Transylvania University, and also the George Washington University’s Elliott School. My real job is as an economist at USDA’s Foreign Agricultural Service, where I analyze and forecast cotton market conditions. I’m married to a kickass Kentucky woman named Ruth.

My posts are not endorsed by and do not in any way represent the opinions of the United States government or any branch, department, agency, or division of it. My writing represents exclusively my own opinions. I did not receive any financial support or remuneration from any party for this research.

Global cotton economist. Migration blogger. Proud Kentuckian. Advisor at Demographic Intelligence. Senior Contributor at The Federalist.