The Best Way to Spot Demographic Nonsense

Freisinnige Zeitung
May 28, 2018


[This is part of my series on Thomas Malthus’ “Essay on the Principle of Population,” first published in 1798. You can find an overview of all my posts here that I will keep updated: “Synopsis: What’s Wrong with the Malthusian Argument?”]

I have alluded to the main point here already in other posts. Basically, this is part of a mini-series within the larger series that clarifies a few issues that in my experience are often muddied. A first post in this regard was: “A Short Explanation of Population Momentum.”

One reason I do this is as a reference, so I don’t have to write the same things over and over again. I also think it is important to understand some basic concepts if you want to write about demographics. That should go without saying. And I wish the many journalists, writers, and even some scientists who feel called upon to lecture others about population dynamics would do some due diligence first. Unfortunately, that is often not the case.

In my view, the best way to spot charlatans writing on demographics is this:

Watch out for those who either argue only with birth and death rates or who confound fertility and mortality with birth rates and death rates. You can often find both symptoms at the same time. Don’t overuse this rule, though. There are some situations where arguing with birth and death rates makes sense. They can also be the only thing you have, and then that’s what you have to work with. I would not hold that against anybody.

But then I would still expect an author to try to back out fertility and mortality, so as to argue with the relevant concepts and not with birth and death rates. Or at least to reflect on how those might be inadequate and how much of an error you introduce in this way. When someone is completely oblivious to this, though, you should be wary. The symptom is not conclusive proof, but in my experience the rule really helps you spot the charlatans most of the time.

Before I get to the main point, let me try to explain the concepts in as simple a manner as possible. This is somewhat technical, but not too difficult. All it takes is to concentrate a little to understand the subtleties. After I have done that, I will demonstrate the inherent pitfalls with a few examples. If you want to argue with birth and death rates as a crude approximation and use them as a first stab, that’s fine. I would do that, too. But once you have grasped the potential problems, you would not leave it at that.

— — —

Birth rates and death rates are easy to understand and also easy to calculate. My hunch is that especially the latter point is the reason why they are so overused. That is especially true of early writers on demographics. Thomas Malthus is a prime example of the relevant confusions. But then most authors at the time were not clear on this, although some had far more insight than others. However, that is no longer an excuse in our times.

The birth rate is simply how many births there are over a time period, usually a year, relative to a reference population. The death rate is defined in parallel as the number of deaths over a time period relative to a reference population. The nice part here is that if you subtract the death rate from the birth rate, then you get by how many people the reference population grows over the time period. Usually, the reference population is set to 1,000 people. Both rates are then mostly in the double digits, rarely in the single digits. If you took 100 people instead, the rates would be ten times smaller and would be percentages.
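To make the arithmetic concrete, here is a minimal sketch in Python. The population, births, and deaths are made-up numbers for illustration only, and the helper name `crude_rate` is my own.

```python
def crude_rate(events, population, per=1000):
    """Events over one year per `per` people of the reference population."""
    return events * per / population

# Hypothetical numbers, for illustration only
population = 50_000
births = 600
deaths = 450

birth_rate = crude_rate(births, population)   # births per 1,000 people
death_rate = crude_rate(deaths, population)   # deaths per 1,000 people
natural_increase = birth_rate - death_rate    # growth per 1,000 people
growth_percent = natural_increase / 10        # the same growth per 100 people

print(birth_rate, death_rate, natural_increase, growth_percent)
```

With these inputs the birth rate is 12 and the death rate 9 per 1,000, so the population grows by 3 per 1,000, or 0.3 percent.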

There is nothing wrong with all this, but there is a tricky point here. Birth and death rates depend not only on the propensity to have children or to die, but also on the structure of the reference population. What you are after is actually the former: how likely births and deaths are. But then the structure of the population also plays a role. Take this as a stark example: A birth rate of 10 or 20 is not astonishing. However, if the reference population is all above age 60 or below age 10, you should perhaps be surprised.

The problem is not so serious if you are aware of what the reference population is. But then the interesting question is how the propensity to have children or to die varies over time. If you look at birth and death rates, then you will see some movements, but those may be hard to interpret if the structure of the population also changes in parallel. In this case, you might have little insight into what is happening.

Hence if you ignore this, you may attribute developments to a changing propensity to have children or to die while in reality nothing changes there at all, or it even moves in the opposite direction. It could all be only about the population structure. As I will demonstrate with an example below, you can get massive movements for the death rate while the propensity to die remains completely fixed.

Fertility and mortality are the relevant concepts here to capture the propensity to have children or to die. To get rid of the problem with varying population structures you have to concentrate on individuals, maybe individuals of a certain type. Unfortunately, it is not as easy to measure the likelihood that they will have children or die. All you can observe is when they actually have children or actually die, not when they just came close and there was only a probability it might happen.

One way around this is to look at larger groups that you hope are homogeneous. If that is so, you can look at what they do on average, and that might be an estimate for the propensities. So you could perhaps look at all women of age 30 and how many of them have a child over the next year. There is still a certain dependence on the population structure because the group of women of age 30 might change in some way that is not apparent. But then this does not depend on whether there are more or fewer women of age 30 in the population this year than last year as with birth rates.

Ideally, you would have all propensities for all age groups, the cohorts for those born in the same year perhaps, maybe even separately for sexes or other criteria. If you have all these data (BTW, a plural) and you also know the population structure, you can recover the birth rate. You just have to apply fertility to the cohorts. But you now have the information for both parts separately and then you can put them together. Hence you can pin down what is going on for each of them while birth rates lump everything together.

The case for mortality and death rates is entirely parallel. Mortality would tell you how likely it is that, for example, men of age 50 will die over the next year. And then you would also like to know this for all cohorts, and also for women. Or you might even want to split that up further for other criteria. Once you have these data and the population structure, you can calculate the death rate from them: you just apply mortality to the cohorts. But it again does not work the other way around because everything ends up in the same quantity, death rates, and it is no longer possible to understand where an effect came from.
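As a rough illustration of applying mortality to a population structure, here is a small Python sketch. The three age groups, the death probabilities, and the two population structures are all hypothetical, and `crude_death_rate` is just a name I picked.

```python
# Hypothetical annual death probabilities by age group (illustration only).
# The same mortality is used for both populations below.
mortality = {"0-19": 0.001, "20-59": 0.003, "60+": 0.040}

def crude_death_rate(structure, mortality):
    """Apply age-specific mortality to a population structure; per 1,000."""
    expected_deaths = sum(structure[group] * mortality[group] for group in structure)
    return 1000 * expected_deaths / sum(structure.values())

# Two hypothetical populations of 1,000 people each
young = {"0-19": 400, "20-59": 500, "60+": 100}
old = {"0-19": 200, "20-59": 500, "60+": 300}

cdr_young = crude_death_rate(young, mortality)
cdr_old = crude_death_rate(old, mortality)
print(round(cdr_young, 1), round(cdr_old, 1))
```

With these inputs the death rate is 5.9 for the younger population and 13.7 for the older one, although the propensity to die is identical in every age group. Going backwards from a figure like 13.7 alone, you could not tell whether it comes from an older population or from higher mortality.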

— — —

What is awkward here is that you basically need a whole function for all ages and the propensities to have children and to die. It would be convenient to also have some measure that condenses the information into one number like with birth and death rates. That is easier to interpret. But then that also has the disadvantage that you lose all the information for the different cohorts.

One way to get close to such a goal is to look at birth rates only for women of fertile age. That casts the rest of the population structure out, which is irrelevant here. Shifts within this group might be less pronounced, so as a first approximation this is somewhat better than birth rates for the whole population. Still, there can be shifts also for the female population structure of fertile age. That’s why this is at best a makeshift.

Another way, which is popular, is to use total fertility rates. You start with the propensities to have children and to die for all cohorts. Then you set up an ideal population whose members are born at age zero and progress to later ages subject to mortality, so this ideal population shrinks to zero over time. If you then also have the propensity to have children, fertility, you can calculate how many children this ideal population would have. On a per capita basis, that’s the average number of children over a whole life with these inputs for mortality and fertility. Usually, this is quoted per woman, i.e. roughly twice as high as per capita.
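Here is a minimal sketch of the calculation with made-up inputs in five-year age groups. Note that the total fertility rate as it is commonly published simply sums the age-specific fertility rates, without weighting by survival; the second figure follows the description above and also applies mortality. All numbers and variable names are assumptions for illustration.

```python
# Hypothetical inputs in 5-year age groups (illustration only).
# fertility: births per woman per year within the age group
# survival:  share of newborn girls still alive at that age
ages = [15, 20, 25, 30, 35, 40, 45]
fertility = [0.01, 0.06, 0.11, 0.10, 0.05, 0.01, 0.00]
survival = [0.99, 0.99, 0.985, 0.98, 0.975, 0.97, 0.96]
width = 5  # years spent in each age group

# Conventional total fertility rate: assumes a woman lives through all ages
tfr = width * sum(fertility)

# Variant with mortality: each age group counts only for those still alive
tfr_with_mortality = width * sum(f * s for f, s in zip(fertility, survival))

print(round(tfr, 3), round(tfr_with_mortality, 3))
```

With low modern mortality the two figures are close (here about 1.70 versus 1.67 children per woman), which is why the distinction is often glossed over.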

All in all, total fertility rates come pretty close. But they also have their pitfalls. For example, mortality and fertility are commonly estimated in a cross section, i.e. at the same point in time. However, that means you have different cohorts that cannot be equated. Yet, in the ideal population you treat them as all the same. Suppose women of age 20 have different plans for their lives than women of age 30 now. Then the assumption that they will have as many children in ten years may be wrong. The same goes for mortality. People who are now 30 might face a different propensity to die in 20 years than people who are now 50.

I will look into this more deeply in another post on the “tempo effect.” That arises when women postpone having children over time or have them earlier. To take a stark example: Those who are now 30 years or older might have had all their children before that age and have none this year. And those who are now younger could postpone all their births until after age 30. So no one would have children, and the total fertility rate would work out to zero.

Yet the younger women might eventually have just as many children, and so a measured total fertility of zero is way off from what you actually want to track. Of course, in reality such effects are smaller. Still, they can result in serious distortions. If the age of mothers at birth rises over time, total fertility rates appear too low; if it falls, they come across as too high. I will leave a more thorough analysis to another post, but maybe you already sense the problem.
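The stark example can be played through in a few lines of Python. This is a toy variant under my own assumptions: every woman has exactly two children, cohorts born before a hypothetical switch year (1990) have them at age 25, and later cohorts have them at age 35. The function names are made up for this sketch.

```python
AGES = range(15, 50)
SWITCH = 1990  # hypothetical first cohort that postpones births

def births_at_age(cohort, age):
    """Children per woman at `age` for women born in year `cohort`."""
    childbearing_age = 25 if cohort < SWITCH else 35
    return 2.0 if age == childbearing_age else 0.0

def period_tfr(year):
    """Total fertility rate from the cross section of one calendar year."""
    return sum(births_at_age(year - age, age) for age in AGES)

def cohort_fertility(cohort):
    """Children that women born in `cohort` eventually have."""
    return sum(births_at_age(cohort, age) for age in AGES)

for year in (2000, 2020, 2030):
    print(year, period_tfr(year))
print(cohort_fertility(1980), cohort_fertility(2000))
```

In this toy setup the total fertility rate measured in a cross section drops from 2 to 0 for the years 2015 to 2024 and then jumps back to 2, while every single cohort ends up with exactly two children per woman.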

Another way to obtain a summary measure for fertility is the number of children that women in a cohort eventually really have. That gets around timing issues and is what you are really interested in. But then the problem is that you know this only in retrospect and with a long lag. Only past perhaps age 45 can you get a grip on the measure, so you are always running behind. When a cohort is close to the end of their fertile age, it is at least possible to get a good estimate of what might be the case eventually. But for younger cohorts where everything is still up in the air, that is not possible, or you have to introduce some strong assumptions like with total fertility rates.

Unfortunately, this is all much more complicated than with birth rates, and the same is true for mortality and death rates. You might know how likely it is for someone who is 50 years old now to die over the next year. But it is not clear whether those now 30 years old will have the same propensity to die in 20 years. That might change.

Again, you can simply introduce a strong assumption that there won’t be a change, and then you can calculate “life expectancy at birth.” However, if there are changes over time, that might not accurately reflect how long those who are born now will live. At best you get an approximation under certain assumptions, which might turn out to be off. Many people tend to gloss over these problems, and often this is really innocuous. But you should at least be aware of the potential pitfalls here when you want to lecture about demographics.

— — —

After these theoretical discussions, let me now show you some examples as a warning of how far off an analysis with birth rates and death rates can be. In my latest posts, I set up a model to explain population dynamics for Japan (see here and here). I make no claim that this is a correct description and you have every right to be skeptical about my forecasts. However, that’s not what I need at the moment. Please view this only as a model for some population that has realistic mortality as in our times and also a moderately realistic distribution for when people have children.

The dynamics for the population look like this:

You have strong population growth initially, but that then runs out. The population next shrinks down from a peak, but only to some extent. Afterwards there are further such oscillations with a decaying amplitude, and the size of the population converges to a fixed level. The first part until our times is quite realistic for Japan. The rest is only a forecast, and it could be wrong. All this comes from a simulation in the background for a virtual population of about 40,000 people with the population dynamics I suspect also for Japan (at least as a first stab).

The advantage of a simulation here is that I have full insight into what is going on. I change fertility in a systematic way, which means I shift fertility up and down depending on population size. This is an explicit input that I know about and do not have to infer from data for an actual population. Since I simulate a history for centuries, I can also wait out realized fertility and compare it to the fertility I plug in. And I can also calculate birth rates. While it is debatable whether I get a good forecast for Japan in this way, this is certainly a realistic model for a population that goes through cycles of high and low fertility.

Here is what I have over almost 300 years for fertility, realized fertility, and birth rates. I cut the first twenty years away where my model leads to very high fertility because I suddenly switch the dynamics on in 1935.

The red curve is the fertility that I plug into the model. That’s how likely people are at a time to have children. Since there is some mortality until fertile age, I correct it down for that, i.e. I divide by the replacement level where the population is stable, which is 2.13 here. You need slightly more than two children per woman because some women will not live long enough to have children. The figure is not perfect, but close to what it is for actual populations, namely about 2.1.
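For a feeling of where a figure like 2.1 comes from, here is a common back-of-the-envelope approximation in Python. It is only a sketch and not necessarily how the 2.13 above was derived: the sex ratio at birth and the survival probability are assumed round numbers, the fertility value of 1.7 is made up, and the function name is my own.

```python
def replacement_level(boys_per_girl, survival_to_childbearing):
    """Approximate children per woman needed so that each woman is
    replaced by one daughter who survives to childbearing age."""
    return (1.0 + boys_per_girl) / survival_to_childbearing

# Assumed inputs: 1.05 boys per girl at birth, 97.6% of girls survive
level = replacement_level(1.05, 0.976)

# Normalizing fertility as in the chart: 1.0 means a stable population
fertility_per_woman = 1.7            # hypothetical value
normalized = fertility_per_woman / 2.13

print(round(level, 2), round(normalized, 2))
```

With no deaths before childbearing age and equally many boys and girls, the same formula gives exactly two children per woman.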

The blue line is for realized fertility. I have to shift it to an appropriate point because peak fertility in my model is at age 30. That’s why I plot realized fertility for the cohort that was born 30 years before. So the value for 2000 is for the cohort of 1970. As you can see, the blue line is a pretty good approximation for the red line. The wiggles are only there because the population is rather small, about 40,000 people, and there is randomness in the simulation. If I took a larger population, that would be less pronounced.

There is still a slight difference that is hard to spot here. Realized fertility is an average over a whole lifetime, and hence also an average over fertility in individual years. You can see that the blue line runs a little after the red line. The scale is large, so this is not completely negligible if you look at it closely. Yet, realized fertility comes very close to what you would want to measure here. (See the comments for a more detailed explanation.)

Now, the green line is the birth rate. Since this is on a different scale (per 1,000 people), I have scaled it down to make it roughly the same size as fertility. Technically, I multiply by 2.8/10. Birth rates almost always lag behind fertility, and the distance is by no means negligible, often at least ten or even fifteen years. The curve also has a slightly different shape: the tops and bottoms sometimes tend to be broader, but sometimes there is also a sharp turnaround that does not correspond to anything for fertility.

This is all spurious. But I assure you that you will find many commentators who write about how the birth rate has now hit a low, and comment on it in real time as if it were some sports event. However, if you look at the graph, that might mean they are actually talking about a low for fertility that occurred a decade or more ago. Any explanations that relate the observation to events now would then be just wrong. Yet, if you don’t understand this, you can write about it with aplomb.

Surely, birth rates are not completely useless if you keep their limitations in mind. They can even give you a hint about what is happening with fertility. The two quantities really do have a connection. And if you only want to draw broad conclusions, that is okay. Yet, if people start to interpret the finer movements here, they might be talking about literally nothing.

— — —

Let’s now look not only at birth rates, but also at death rates for this model where things become far worse:

I work on the original scale here, which is per thousand population. Hence all figures are one order of magnitude larger. The blue line is for the birth rate, and the green line for the death rate (time now runs starting at zero, sorry). In addition, I have also plotted the difference between the two as the red line. That is by how much the population grows or shrinks (the negative values). If you divided it by 10, you would have percentages.

Note that in my model, I always have the same mortality by construction. Nothing ever changes there. But death rates go through wild swings nonetheless. This is only so because of a changing population structure. It is hence completely wrong to talk about changes for death rates as if they showed anything about the propensity to die. There is literally no change for that. But then you will see lots of examples where people are unaware of this and naively assume that what they see can only come from changes in mortality. To add insult to injury, they will then often even call the death rate mortality.
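You can reproduce the effect with a much cruder toy model than mine. The following Python sketch is not the simulation behind the charts: it assumes a made-up mortality schedule that never changes and a hypothetical baby boom (births double for 30 years and then return to the old level). All names and numbers are assumptions for illustration.

```python
MAX_AGE = 100  # ages 0..99

def death_probability(age):
    """Hypothetical fixed schedule: rises with age, certain death at 99."""
    if age == MAX_AGE - 1:
        return 1.0
    return min(1.0, 0.0002 * 1.09 ** age)

Q = [death_probability(a) for a in range(MAX_AGE)]  # never changes

def stationary_population(births):
    """Age structure that results from constant births under Q."""
    pop, alive = [], float(births)
    for q in Q:
        pop.append(alive)
        alive *= 1.0 - q
    return pop

def simulate_death_rates(births_by_year, initial):
    """Crude death rate per 1,000 for each year, with mortality fixed at Q."""
    pop = list(initial)
    rates = []
    for births in births_by_year:
        deaths = [n * q for n, q in zip(pop, Q)]
        rates.append(1000 * sum(deaths) / sum(pop))
        survivors = [n - d for n, d in zip(pop, deaths)]
        pop = [float(births)] + survivors[:-1]  # everyone ages by one year
    return rates

births = [200] * 30 + [100] * 150  # hypothetical boom, then back to normal
rates = simulate_death_rates(births, stationary_population(100))
print(round(min(rates), 1), round(rates[0], 1), round(max(rates), 1))
```

The death rate first falls below its starting level while the boom makes the population younger, then rises above it decades later when the large cohorts reach old age, and finally returns to where it began. Mortality is the same list `Q` in every single year.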

In reality, even in earlier times, it is not unrealistic to assume that changes for mortality over time were at most slow and moderate. Hence my assumption of fixed mortality is not a strange premise, but actually close to reality, even more so in modern times with very low mortality. Mostly what you see for death rates is the changing population structure. But that is not what you would expect. Instead your idea might be that you keep track of mortality.

So the least someone would have to do here is rule out that they are commenting on a pure artifact that is driven by the population structure and not mortality. Actually, oscillatory movements for population as in this case with long phases of growth and then shrinkage are very common for actual populations, and then you will also find movements like in this chart for death rates. But those then might mean nothing. If you see someone who begins to tell a story about how something happened after year 40 in the chart that then led to much higher mortality, they only tell you that they are stupid or, more benevolently, totally ignorant.

— — —

If you think this is just pedantic and few people ever get this wrong, I am afraid you are in for a rude shock. I could show you an endless list of examples where authors get all this totally wrong and are even unaware there might be a problem. That is not only so for journalists and writers. I also know many examples where, for instance, economic historians fail in this regard.

One example, though by no means the only one, is Gregory Clark. In his “A Farewell to Alms,” he sets his basic model up with the assumption that a population pursues a birth rate. That cannot be so. No population could ever do this because it would have to keep track of its population structure all the time. To achieve a fixed birth rate, it would have to correct its fertility on a continual basis and in a nonsensical way to achieve a result that Clark thinks is the natural first assumption.

And then he explains what happens with death rates that go up to keep the population fixed. Death rates depend on the population structure, though. And what happens might only be a reflection of how the population changes, and not that of a change for mortality. Yet, he always treats it as if rising death rates could only result from higher mortality.

And of course, when he interprets historical data, he falls into all the traps that are set up and relates shifts to events at the time, both for birth and death rates, and not to when they really happened. To make the confusion perfect, birth rates are also frequently called fertility, and death rates mortality. If you have understood the above points, that should make you cringe like it makes me cringe.

But then the populations he studies are of the type above. Here is the difference of the birth and the death rate for England from 1500 to our times. The blue line is an attempt to smooth that a little because the underlying data are perhaps calculated from a population just as small as in my simulation until exhaustive censuses begin around 1800:

As you can see, you have the same type of long-run oscillations as in my model. Since there is mostly population growth, the values are more on the positive side while my assumptions above lead to swings around zero because the population stabilizes. The movements are longer, though, and in my view driven by other things. Yet, you should expect an ever-changing population structure here, too. In that case, birth rates may lag fertility, and death rates might be seriously misleading.

— — —

I will leave it at that. I am sure you will find plenty of examples that look immediately dubious when you heed my advice. As noted above, you should not dismiss conclusions only on that basis. But it is not wrong to put your thinking cap on and reflect critically on what is going on. If someone takes birth and death rates as a starting-point, but then goes over to fertility and mortality, great. But if they fail to do that, and are maybe even confused about the concepts themselves, you should be wary.

In my next post of this mini-series, I will explain the “tempo effect.” I have already sketched where this is going. The errors here are not as large as when someone does not understand population momentum or fertility/mortality versus birth/death rates. But then especially because of its subtlety, this is a very good indicator whether someone is a charlatan, though maybe with an advanced degree because you first have to understand fertility to some extent. Yet, in my experience this does not preclude that someone can also be confused about that.
