What European Population History Can Tell Us About Puerto Rico

Let’s Be Honest: Puerto Rico is just a hook I use to make you read arcane demographics minutiae

Lyman Stone
In a State of Migration

--

UPDATE: I have updated some of the charts and data to reflect better population estimates for several countries, most notably Spain across all years and Italy prior to 1880.

I recently wrote a blog post about Puerto Rico in which I suggested that the ongoing population collapse there could eventually become comparable in scale to the post-famine depopulation of Ireland. But what struck me about this little project was how little I really knew about European population trends. So I started digging into the topic more and more. I’ve previously written about Nordic population trends, of course, but I found myself very curious about what other stories were out there.

This post, then, is the current state of my notes on Western European population history since 1700, and what the findings might suggest about Puerto Rico. Why do I keep fixating on Puerto Rico? Simple: it is one of the most dramatic demographic stories in America today, along with the oil fields of North Dakota and the recovery of the District of Columbia. For anyone who cares about population economics, the case of Puerto Rico is phenomenally interesting, and worth viewing through every lens we can focus on it.

Let me explain those limiting factors on my dataset. I look at just Western Europe, which I define as you can see below.

This focus on Western Europe is not because these areas are bigger, or better, or more important. It is because I am trying to identify population trends for areas with a fixed geography. That is, I’m not asking, “What was the population of France in 1863 in its 1863 borders?” I’m asking “How many people lived in France’s 2016 borders in 1863?” This is a very challenging question even for countries where border changes have been comparatively minimal, like France or Portugal. For a country where border changes have been quite extensive, like Germany, it is extremely challenging and requires a fair amount of hand-waving. And for countries where border changes have been extreme and frequent, like Poland, Russia, or the Balkans/former Austria-Hungary, this task is impossible given the amount of effort, language abilities, and primary source access I have. So because of basic data constraints involving border changes, I restrict to western Europe.

I restrict to the post-1700 period because there’s no good data before 1700 for almost any country; and the truth is that I don’t get good data in every country until 1900, and, even then, border changes make interpreting the data difficult. But I found that 1700 was the earliest starting point I could use. The point, however, is that my numbers around 1700 are less reliable than around 1750, which are less reliable than around 1850, and so on. After 1950, I am especially confident in my data.

A note about how I constructed the data may be in order. Every country is its own beast. I of course scoured the interwebs for other peoples’ estimates, Census data, and other data sources. I can’t give you a source list that means anything because each country’s sources vary, and I often used sources that were themselves compilations of other sources. I assembled multiple different time series for each country representing each identifiable source, and then created a single harmonious time series based on selecting and weighting between these various time series. Where I was able to verify the validity of a source, I weight it much more heavily. Where a source seems unverifiable or speculative, it is weighted less heavily. The point is, my final time series for a given country or region represents a composite of many sources and estimates. It represents a kind of “central tendency” among what estimates are out there.
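To make the weighting idea concrete, here is a minimal Python sketch of blending several source estimates for one country-year into a single composite. It is only an illustration of a weighted average, not the exact procedure behind the dataset; the source names, weights, and population figures are all hypothetical.

```python
def composite_estimate(estimates, weights):
    """Weighted average of whichever sources have an estimate for a year.

    estimates: dict mapping source name -> population estimate (None if missing)
    weights:   dict mapping source name -> relative trust placed in that source
    """
    available = {s: v for s, v in estimates.items() if v is not None}
    if not available:
        return None
    total_weight = sum(weights[s] for s in available)
    return sum(weights[s] * v for s, v in available.items()) / total_weight


# Hypothetical figures for one country in one year (illustration only).
estimates_1800 = {
    "census_backcast": 2_100_000,
    "compilation_a": 2_300_000,
    "speculative_b": None,  # this source has nothing for 1800
}
weights = {"census_backcast": 3.0, "compilation_a": 1.0, "speculative_b": 0.5}

print(composite_estimate(estimates_1800, weights))
```

A verified source gets a larger weight and pulls the composite toward itself; a source with no estimate for the year simply drops out of the average.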

Why do I need to do this kind of kludgy imputation and combination? Simple: my goal is to have a sense of the comparative population of given regions in given years. But different countries conduct censuses in different years, and thus I need to extrapolate back into common years. Rather than do this for an arbitrary basket of comparison years, I did it for all countries and all years, allowing me to make an “All western Europe” population estimate for each year. This estimate of course contains the sum of all underlying error. But some errors will offset: for example, if I have misattributed Dutch territory to Belgium in some year, this will make both individual series incorrect, but my combined series will be undisturbed. Simple errors in total estimates obviously can’t be corrected so easily, but my hope is that errors across countries will be uncorrelated, and my belief is that I have done a reasonably good job of selecting the best available estimates and extrapolating between them reasonably judiciously.
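The common-years problem can be sketched in a few lines of Python. This version fills the gaps between census points by assuming a constant growth rate between them, which is an assumption of the sketch rather than a description of how the dataset was actually built, and the country names and counts are hypothetical.

```python
def interpolate_annual(points):
    """Fill in every year between known census points.

    points: dict mapping year -> population for the years with real data.
    Assumes a constant growth rate between adjacent points (an assumption
    of this sketch, not necessarily how the dataset was built).
    """
    years = sorted(points)
    series = {}
    for start, end in zip(years, years[1:]):
        span = end - start
        ratio = points[end] / points[start]
        for offset in range(span):
            series[start + offset] = points[start] * ratio ** (offset / span)
    series[years[-1]] = points[years[-1]]
    return series


def combined_series(points_by_country):
    """Sum the annual series over the years that every country covers."""
    annual = [interpolate_annual(p) for p in points_by_country.values()]
    common_years = set.intersection(*(set(a) for a in annual))
    return {year: sum(a[year] for a in annual) for year in sorted(common_years)}


# Hypothetical census counts (thousands); the two countries count in different years.
points_by_country = {
    "country_a": {1800: 100.0, 1810: 121.0},
    "country_b": {1805: 50.0, 1815: 60.0},
}
total = combined_series(points_by_country)
print(sorted(total))  # the years both series cover
```

Because the total is a plain sum, people misattributed from one country to its neighbor in a given year cancel out of the combined figure, which is the offsetting-error point made above.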

For example, where I know about specific population incidents that occur between data-points, like wars or epidemics, I have created my own shocks to reflect that. For this reason, not all interpolations between points are straight lines, because sometimes I have a good reason to think it really wasn’t a straight line and that I can reasonably approximate the true shape of the trend. Thankfully, some between-point shocks for some countries occur at times where other countries have consistent data, and if shocks are shared, I can get an idea of what happened. This is especially the case for wars that occurred within two countries’ territories (though here it’s notable that if a war occurs all in one country’s territory, the non-hosting belligerent may show large deployment declines while the hosting-belligerent may show larger loss of population even as the presence of larger armies causes wartime population to be inflated).
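As a rough illustration of what a hand-built shock might look like, the Python sketch below takes an already-interpolated annual series and knocks it down in the shock year, with the loss fading out linearly over a recovery period. The function name, the linear fade, and every number are assumptions made for the example, not the shapes actually used in the dataset.

```python
def apply_shock(series, shock_year, loss_share, recovery_years):
    """Lower a series by loss_share in shock_year, fading linearly back to trend.

    series: dict mapping year -> population (already interpolated)
    """
    shocked = dict(series)
    for offset in range(recovery_years):
        year = shock_year + offset
        if year in shocked:
            # Share of the original loss still missing this many years later.
            remaining = loss_share * (1 - offset / recovery_years)
            shocked[year] = series[year] * (1 - remaining)
    return shocked


# Hypothetical flat series with a 10% loss in 1802 that is made up within two years.
flat = {year: 100.0 for year in range(1800, 1806)}
print(apply_shock(flat, shock_year=1802, loss_share=0.10, recovery_years=2))
```

The endpoints stay where the census data put them; only the path between them bends.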

What these caveats mean is that you shouldn’t use my data as an important variable in other regressions. This is not a valuable independent or dependent variable for any kind of statistical exercise, because I’ve anticipated likely independent variables and forced them to have an effect.

The only purpose for which my data is suited is comparing population levels and growth rates across time and territory.

These are descriptive statistics, nothing more. And they are flawed: if you’re reading this and you’re an expert on the population history of one of these countries and I got it wrong, please tell me. The data is all available for download. Please critique it! I only request that your critique come in the form of providing an alternative estimate, ideally with a link to a source. And for the record, if it’s on Wikipedia, Populstat, Tacitus, OECD, World Bank, Eurostat, a contemporary governmental statistical agency, or the first page or two of google results (web or scholar), then I’ve probably already seen it. Don’t leave comments saying I ignored that awesome data at Populstat; I am not ignorant of it, I just have found that it is very frequently totally wrong.

So, to start with, let’s do headline numbers.

Above I show W. Europe population vs. US population. To clarify, that’s population of all territory that would eventually become the USA, not just the historic USA at the time. The gray line is the gap. As you can see, Western Europe’s population advantage was stable or rising until the World Wars, which knocked them down a fair bit, and then the gap plummeted as the U.S. began allowing in large numbers of immigrants. I should note that in very recent years, as immigration into Western Europe has risen, the gap between the U.S. and Western Europe has been flatter.

We can also look at growth rates.

The above chart shows the annualized 20-year growth rate for the US and W. Europe. I use a long period because the nature of my interpolations is that you get some wonky individual-year growth rates, but they get smoothed out over time. The important thing to note here is that the U.S. has consistently had a faster rate of growth, but the exact gap has varied. From the early 1700s to the 1950s, the gap shrank continuously as immigration represented a declining share of growth. But with the return of immigration, and a faster European fertility collapse, the growth-gap has stabilized.
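For readers who want the arithmetic, an annualized 20-year growth rate is just the compound annual rate between a year and the year two decades earlier. A minimal Python sketch, with a hypothetical series standing in for the real data:

```python
def annualized_growth(series, year, window=20):
    """Compound annual growth rate over the `window` years ending in `year`.

    series: dict mapping year -> population
    """
    start = series[year - window]
    end = series[year]
    return (end / start) ** (1 / window) - 1


# Hypothetical series that grows exactly 1% a year from 1900 to 1920.
series = {1900 + i: 1000.0 * 1.01 ** i for i in range(21)}
print(annualized_growth(series, 1920))
```

Because only the two endpoints enter the calculation, a wonky single-year jump produced by interpolation has little effect on the long-window rate.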

I should note that the annual data suggest that since 1700, European population growth has exceeded American in just two years: 1944 and 1945. Before that sounds totally insane, keep in mind U.S. population plummeted as we deployed lots of soldiers to Europe (emigration!), and those soldiers are counted as resident in Europe. If we take the whole war-and-recovery period together from 1938–1948, all Western European population grows 3.6% in total, versus 12.6% in the U.S.

So that’s that. Let’s get to the fun part: individual countries!

Sadly, Datawrapper doesn’t have a tool for the most intuitive way to represent this data: line graphs where you can select what country or countries you want to display. I tried making a tool where you’d select a country and it’d spit out a bar graph, but it was clunky and ugly and didn’t produce the result I wanted. So… here’s a confusing line graph:

Two big caveats here: The UK and the Republic of Ireland have been changed into Great Britain and All Ireland, because that’s how I was able to get the historic data. Whine at me about it later. I do have UK vs. Republic of Ireland from the 1840s on, but back to 1700 I needed to do the whole island as a unit.

Second, please forgive me for Germany. I did my best, but there are so many border changes. I think the estimates I’ve made correctly reflect population within Germany’s current borders, but I welcome criticism on that front.

Now that we have this data, we can do some fun stuff. For example, we can compare population growth in Western European countries from 1840–1900 to U.S. states in the same period. Oh and, here’s a fun thing: I actually built sub-national population series for the larger Western European countries too, so my 18 entities in the spreadsheet above morph into 32 entities for this comparison.

The result here is pretty neat. Most US places (red) have faster growth than most Western European places (blue) in this period… but there are some exceptions! Greater Paris (defined as the largest definition of Paris in use today, so many of these residents didn’t think of themselves as residing in Paris) posted faster growth than many U.S. states, especially slow-growing southern states. Meanwhile, Vermont exhibited some of the slowest growth of any region, coming in somewhere between non-Paris France and Northwest Germany.

We can also look at interwar growth.

Here, U.S. growth dominance is far less in evidence. And, interestingly, Wales is the slowest-growing place. The fastest-growing place in western Europe is not actually in western Europe: it’s Greenland! I include Greenland in western Europe due to its ties to Denmark and because I freakin’ can. It’s so small it doesn’t impact any headline results if I drop it.

Just so you can get an idea of the types of things this dataset can look at, here’s Florida and Wales from 1900 to 1950:

Now, look, I’m not trying to say anything at all about Wales and Florida. This is just an arbitrary comparison. But I find that when I think about history, it helps me to have benchmarks, comparisons, to have some kind of sense of scale. During World War I, Wales was a larger population base for recruitment and industry than the state of Florida was. By World War II, they were neck and neck.

Finally, let’s get to the interesting thing that really motivates all this: population decline! I’m looking for periods where population declines substantially, and ideally not just because of wartime deployment. For each instance, I’ve tracked population from the pre-decline peak to the year where population rose above the previous peak, or else to the most recent year if the prior peak has not been reached. The chart below is very noisy and, note, there’s a region included in it that I didn’t include in the charts above: Puerto Rico.
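The peak-to-recovery tracking described above can be sketched as follows. This is a minimal Python illustration, assuming a 5% threshold for what counts as a "substantial" decline (my threshold for the example, not one stated in the post) and using hypothetical numbers.

```python
def decline_episodes(series, min_drop=0.05):
    """Find declines of at least min_drop below a prior peak.

    series: list of (year, population) tuples in chronological order.
    Each episode runs from the pre-decline peak until population rises
    above that peak, or to the last year if it never does.
    """
    episodes = []
    peak_year, peak_pop = series[0]
    path = []  # (years since peak, fractional change vs. peak)
    for year, pop in series[1:]:
        if pop > peak_pop:
            # Above the previous peak: close any open episode, start a new peak.
            if path and min(change for _, change in path) <= -min_drop:
                episodes.append({"peak_year": peak_year, "recovered": year, "path": path})
            peak_year, peak_pop = year, pop
            path = []
        else:
            path.append((year - peak_year, pop / peak_pop - 1))
    if path and min(change for _, change in path) <= -min_drop:
        episodes.append({"peak_year": peak_year, "recovered": None, "path": path})
    return episodes


# Hypothetical series: one decline that recovers, one that is still ongoing.
example = [(1900, 100), (1901, 110), (1902, 99), (1903, 88),
           (1904, 105), (1905, 112), (1906, 120), (1907, 108)]
for episode in decline_episodes(example):
    print(episode["peak_year"], episode["recovered"])
```

Each episode's path is indexed by years since the peak, which is what lets declines from different centuries be drawn on one chart.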

Now, okay, that’s a super busy chart. But, again, if you download it, all the lines are labelled. But let me restrict to just some interesting cases of decline, so it’s more readable.

I’ve truncated Ireland’s data to let us zoom in a bit. Puerto Rico is the bolder pink line that ends in year 17. As you will notice, Puerto Rico has a fairly unique problem: its population decline is accelerating, when most historic declines in my sample regions have slowed by 17 years after peak. Puerto Rico’s population decline is on pace to be more severe than Oklahoma’s Dust Bowl if current trends continue into 2018, and if they continue to 2020, it will be the most serious state-level-area population collapse in the U.S. other than the District of Columbia (thin black line that goes below -35).

All of this to say, comparing Puerto Rico to general declines gives more reason for pessimism.

But there’s a catch! Notice the story of Corsica, the orangish line with severe population decline and then a rebound. Corsica, like Puerto Rico, is an island associated with a much larger mainland country, with which it has freedom of movement. Corsica, like Puerto Rico, has lower income than the mainland, it has a greater degree of autonomy than other regional units, it has linguistic differences, it has an independence movement, etc. Corsica is in fact a strikingly good comparison to Puerto Rico.

But that brings to mind other possible comparisons! What about Sicily, Sardinia, Ireland, the Canaries, and the Balearics? What about the Faroes, Greenland, Iceland? Well, what about them?

Population estimates pre-1800 are less reliable, particularly pre 1865, and especially Hawaii pre-1830. Estimates for Hawaii pre-1800 are extremely hotly debated and I have selected a reasonable midrange, but am in fact agnostic about the true level of population pre-contact.

There’s the population of all my island areas! Nifty, eh? But the scales are really different, so this doesn’t tell us that much. So let’s do a different thing: let’s look at each island area’s population as a percentage of its prior maximum. So, for example, in 1780, that means a population will be represented as (1780 Population / Max pre-1780 Population) minus 1 (so negatives mean lower than prior max, positives higher). Here’s what the results look like:
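In code, the prior-maximum measure is a running maximum and a ratio. A minimal Python sketch with hypothetical numbers:

```python
def vs_prior_max(series):
    """Population relative to the highest level seen in any earlier year.

    series: list of (year, population) tuples in chronological order.
    Returns a dict mapping year -> (population / prior maximum) - 1, so
    negatives are below the prior max and positives are new highs.
    """
    result = {}
    prior_max = series[0][1]
    for year, pop in series[1:]:
        result[year] = pop / prior_max - 1
        prior_max = max(prior_max, pop)
    return result


# Hypothetical island population (thousands).
island = [(1700, 100), (1710, 120), (1720, 90), (1730, 150)]
print(vs_prior_max(island))
```

The first year has no earlier year to compare against, so it is left out of the result.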

So Iceland early on often experiences sharp declines, Hawaii post-contact loses tons of people, the Irish depopulation is massive, and Corsica is really severe too. After Hawaii, Ireland, Corsica, and Iceland comes Puerto Rico. If you want more nitty-gritty than that, download the data and have fun.

So here’s the question we want to ask. What is Puerto Rican depopulation like? Is it like Irish depopulation, with a short-term obvious cause, but actually a very long-run problem of basic demographic imbalance that the proximate cause simply hastened? Or is it like Corsica, where a few shocks could cause a massive flood back in? Well, it’s worth investigating what happened in Corsica. First, population growth was already low-ish before WWI. But then, of 300,000 Corsicans, 50,000 enrolled in the military for WWI, which makes it one of the highest recruitment rates in Europe. Somewhere between 12,000 and 26,000 of them died. Corsica housed prisoners of war as well, which doesn’t exactly drive up local amenity values, but did contribute to population in a direct, mechanistic sense. Many Corsicans left and migrated to French colonies, especially after 1936… but then WWII brought an occupation by 85,000 Italians, 12,000 Germans, and rumblings of annexation.

Because many Corsicans had migrated to French colonial holdings and to Algeria, they featured prominently in colonial administration, and tended to be opposed to any colonial withdrawal. Corsica joined Algerian colonists in stringently opposing withdrawal from Algeria… and when the colonies began to be closed up, bam, a wave of population increase hit Corsica, with the island being a major resettlement site for former French colonists. This has of course had its own controversial components within Corsica, and, through various mechanisms, has incited the push for Corsican autonomy or even independence.

UPDATE: I have gotten lots of feedback about Corsican population trends. Several commenters suggest that population peaked before WWI. However, this would require me to disregard the Census data that official French sources seem to hold as authoritative. Unless I can be provided a fairly complete alternative time series, I am very hesitant to break with Census data. I’ve also been asked if I have any evidence for my claims about the relevance of resettled French colonists: yes! The Census of 1962 identifies at least 17,000 such resettled individuals, and the continuing growth of this population through 1975 provoked the “Aleria incident,” a case of Corsican violent resistance to the French policy of resettling pieds-noirs in Corsica. Some commenters have asked if I have data showing migration drove Corsica’s population trends. As I’m not an expert in French historic Census data, I don’t know what migration data exists before 1975 for French regions. But from 1975–2015 I have fairly complete vital statistics for Corsica which show that mortality exceeded French norms and natality was low, indicating that growth must have come from net migration. Combined with the evidence of the presence of resettled Algerians, this makes my story all the more plausible. It should be noted that, depending on sources used, somewhere between 300,000 and 1.2 million Francophone people departed Algeria after independence, so if even a small share went to Corsica, it could drive large population changes.

Is that Puerto Rico’s experience?

Well, not quite. Puerto Rico actually has a fundamental demographic imbalance more like Ireland: mortality is rising, fertility falling. Puerto Rico’s population is emigrating, like Corsica’s, to other holdings of the parent country… but a sudden expulsion of Puerto Ricans, like what French colonists experienced in the 1960s, is unimaginable.

Thus, it seems more likely that Puerto Rico will follow an Irish path of decline-and-stagnation rather than a Corsican-style decline-and-rebound.

The one major factor supporting the idea of a Corsican-style rebound is this idea of administrative similarity. We can, however, compare Puerto Rican population vs. its “Federal” migration partner with Corsica’s or Ireland’s. For Puerto Rico, I compare to the USA (including Puerto Rico). For Corsica, I compare to Metropolitan France. For Ireland, I compare to both the British Isles, and to the British Isles + USA.

This chart shows that Ireland was a very large share of migration-area population. This means the outside-of-Ireland population which could easily return to Ireland would have had to exhibit a very high rate of migration to offset Ireland’s decline; in other words, it would have taken a very big shock to reverse Ireland’s decline because the decline was large relative to potential migrants.

Puerto Rico’s population share is much larger than Corsica’s, of course, and here the ratio probably matters more than the absolute gap; since Puerto Rico’s share of its migration area is double Corsica’s share of its own, the migration snap-back necessary for Puerto Rico’s decline to be offset is correspondingly double what was necessary for Corsica. But Ireland was more than 20 times as large, relatively speaking, as Puerto Rico. In other words, in terms of the raw demographic possibility of a rebound-wave of migration, Puerto Rico looks a lot more like Corsica than Ireland, though it still would need a shock of similar relative magnitude to thousands of people being deported with relocation partly paid for by the French government.
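The logic of that ratio can be written down directly. The Python sketch below asks what share of the outside population would have to move to the island to replace a given loss; the shares and the 10% loss are hypothetical round numbers chosen for illustration, not figures from the dataset.

```python
def required_outside_migration_rate(island_share, decline_to_offset):
    """Share of the outside population that must move in to offset a decline.

    island_share:      island population / total migration-area population
                       (the total includes the island itself)
    decline_to_offset: the island's loss as a fraction of its own population
    """
    outside_share = 1 - island_share
    return decline_to_offset * island_share / outside_share


# Hypothetical shares: a small island (1% of its area) vs. a large one (20%).
for label, share in [("small island", 0.01), ("large island", 0.20)]:
    rate = required_outside_migration_rate(share, decline_to_offset=0.10)
    print(label, rate)
```

When the island is a small slice of its migration area, the required rate scales roughly in proportion to its share, which is why doubling the share roughly doubles the snap-back needed.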

Conclusion

Puerto Rico’s situation is bad, and unlikely to face a sudden reversal. This should be kept in mind as we think about how policymakers can plan for Puerto Rico’s future. This is especially unfortunate given that the absolute scale of Puerto Rico’s problem is sufficiently small that it could be offset by reasonably strong migration from the mainland. Puerto Rico is not that big. A substantial return migration wave, a sudden burst of retiree enthusiasm, a change in tax or legal status, any of these, if it were truly a meaningful change in costs and preferences, could find a large pool of migrants, certainly large enough to offset decline. This was not the case in Ireland; there was no plausible place with enough people where a feasible migration rate could have offset Irish decline. Corsica’s decline was offset by exactly such an outside population (returning French colonists). Thus, we should say that although Puerto Rico’s demographic recovery is extremely unlikely, it is not outside the realm of what has been historically possible.

Check out my Podcast about the history of American migration.

If you like this post and want to see more research like it, I’d love for you to share it on Twitter or Facebook. Or, just as valuable for me, you can click the recommend button at the bottom of the page. Thanks!

Follow me on Twitter to keep up with what I’m writing and reading. Follow my Medium Collection at In a State of Migration if you want updates when I write new posts. And if you’re writing about migration too, feel free to submit a post to the collection!

I’m a native of Wilmore, Kentucky, a graduate of Transylvania University, and also the George Washington University’s Elliott School. My real job is as an economist at USDA’s Foreign Agricultural Service, where I analyze and forecast cotton market conditions. I’m married to a kickass Kentucky woman named Ruth.

My posts are not endorsed by and do not in any way represent the opinions of the United States government or any branch, department, agency, or division of it. My writing represents exclusively my own opinions. I did not receive any financial support or remuneration from any party for this research.

Global cotton economist. Migration blogger. Proud Kentuckian. Advisor at Demographic Intelligence. Senior Contributor at The Federalist.