Micro-Foundations for Stabilizing a Population

Freisinnige Zeitung
May 31, 2018

[This is part of my series on Thomas Malthus’ “Essay on the Principle of Population,” first published in 1798. You can find an overview of all my posts here that I will keep updated: “Synopsis: What’s Wrong with the Malthusian Argument?”]

In my previous posts in this series, I have just made the assumption that a population can stabilize at a certain target size. Another premise was that the population is also able to keep that target size fixed over time. That is by no means obvious. Your parents did not tell you at some point: “Ah, lest we forget: We love this one specific target size. Be sure to pursue it, too!” And even if they did, what would that mean for you?

There are lots of other things that may seem strange at first glance. How and why would anyone try to stabilize the population at some target size? The result is not even directly observable unless everybody does a census all the time or at least reads up on the results when others do. Or how would you do this on an individual level even if you knew the target size? Children are not divisible: you cannot have 0.7 or 3.2 of them just because some formula says so.

In this post, I would like to present a moderately realistic model for how it can be done. As you will see, there have to be some inputs, but those are not hard to obtain. The computations are also anything but complicated. My point here is that a population does not have to do all this on a conscious level; the process might well be implemented “in hardware.” And it is so easy that other species could do it, too. At no point are higher capabilities required that are specific to humans.

— — —

Let me explain the setup first, which is somewhat idealized, but captures a lot of what probably also applies to real populations. The underlying mechanism is robust, and should be so. Hence the outcomes would be similar even if you played around with the assumptions. I have only done this to some extent; actually, I obtained the results only today. Hence this is only a very preliminary analysis. Still, my claim that it works looks pretty safe, although I have to admit that I don’t know why it works so well.

Here are now the rules of engagement for the population in my simulation:

A population consists of men and women. I assume that both have the same mortality, ie. propensity to die at a certain age, which I set to values that are roughly as in modern times: 5% mortality until age 30, mortality speeds up after age 50, and then even more so after age 80. Everybody is dead after 100 years. I also assume that the sex ratio at birth is 50/50 for men and women. In reality, slightly more boys are born. However, they suffer also from higher mortality, which equalizes this to some extent.

The population lives on a square, which has a side length of 1, ie. coordinates fall into a range from 0 to 1 in two dimensions. I place people at random on the square at first. Later when children are born, they are based where their parents live. The only change here is when two people marry. I assume that the woman moves in with the man. That is inessential, it would work just the same if it were the other way around or at random.

People get to marry each other in this way: I start with a woman, but could also start with a man, which is unimportant. She has to be at least 18 years old to marry. What she does in a year is look at the set of potential husbands. Those are all men who live at most a certain distance away (I set this to 0.1).

Then she narrows those down to a certain age range. There is a rule for age differences that are considered socially acceptable, which works quite well across many societies as far as I know and is also intuitive in my view. A partner should be no older than twice the other partner’s age after subtracting seven years, i.e. (age - 7) * 2. So if a woman is 18 years old, that yields an upper bound of (18 - 7) * 2 = 22 years. If she is 30 years old, the range is much broader, and the upper bound works out to (30 - 7) * 2 = 46 years.

I then filter all men out who are outside a range from the age of the woman to this upper bound. Of course, in reality you also have couples with larger differentials, though I guess rarely, and also couples where the woman is older than the man. Yet, all but one country in the world — Nauru being the exception — have an age at first marriage for men that is higher than for women (see here for data).
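The age rule is easy to write down. Here is a minimal sketch in Python, which is not the Java code behind my simulation; the function names are made up for this illustration.

```python
def max_partner_age(age):
    """Upper bound for a socially acceptable partner: (age - 7) * 2."""
    return (age - 7) * 2


def in_age_range(woman_age, man_age):
    """True if the man is at least as old as the woman
    and no older than her upper bound."""
    return woman_age <= man_age <= max_partner_age(woman_age)


print(max_partner_age(18))  # 22
print(max_partner_age(30))  # 46
```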

Next I exclude all close relatives of the woman: She cannot marry her father. But since I assume that widowers do not remarry that is not possible anyway. And then the rule for the age range precludes this mostly, too. She cannot marry her sons either. But then that is not possible because they are younger and she will not remarry even when she is widowed.

But then I also exclude the next layer: She cannot marry her grandfather, although the rule for the age range already sees to that, and excluding remarriage makes it impossible, too. Here is where it becomes relevant: She cannot marry a brother, which would otherwise be possible. My simulation is flexible enough to go further. I could also exclude uncles and cousins or even more distant relatives. However, all this only makes it harder to find a spouse, which should not play a major role for the results.

Finally, I select one of the potential husbands from this set at random, and then with a certain probability, set to 50% below, they get married. If it does not work out this year, then maybe next year, and so forth. A probability of 50% means people get married early, though, as you will see. As noted, this part is rather inessential for the results, I get similar outcomes when I fiddle with the assumptions: a lower probability to marry someone in a year, exclusion of farther relatives, and a search with a smaller distance. It then only becomes harder to find a spouse. People will marry later, and more will be unmarried.
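To make the yearly search concrete, here is a rough sketch in Python of one attempt by one woman. It is an illustration under assumptions, not my actual program: people are plain dictionaries, and the keys ('id', 'age', 'pos', 'married', 'father', 'mother') as well as the function name are invented for this example. The flag 'married' is assumed to stay set for the widowed, since nobody remarries.

```python
import math
import random


def find_husband(woman, men, max_distance=0.1, marry_probability=0.5, rng=random):
    """One yearly attempt; returns the chosen man or None.

    Hypothetical data layout: each person is a dict with the keys
    'id', 'age', 'pos' (x, y), 'married', 'father' and 'mother'.
    """
    if woman["age"] < 18 or woman["married"]:
        return None
    upper = (woman["age"] - 7) * 2
    candidates = []
    for man in men:
        if man["married"]:
            continue  # also rules out fathers and grandfathers
        if not (woman["age"] <= man["age"] <= upper):
            continue  # outside the acceptable age range
        if math.dist(woman["pos"], man["pos"]) > max_distance:
            continue  # lives too far away
        # Brothers share at least one known parent with the woman.
        shares_parent = any(
            woman[p] is not None and woman[p] == man[p]
            for p in ("father", "mother")
        )
        if shares_parent:
            continue
        candidates.append(man)
    if not candidates:
        return None
    chosen = rng.choice(candidates)
    # The match only works out with a certain probability this year.
    return chosen if rng.random() < marry_probability else None
```

If the function returns None, the woman simply tries again in the following year.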

— — —

Now, the crucial part is fertility. In my previous simulations, I modeled that as only dependent on age. But that makes no sense here or also in reality. It is not so that families have children just by the age of their parents, like by some mysterious mechanism a fixed share of all women 30 years old get pregnant in a year.

There are also many other potential problems: Fertility for the population depends on the share of people who are married because I exclude children out of wedlock in my simulation. If a couple marries later, it cannot work that they have children like people who married earlier. And then since a variable share of the population are not married, those who are married also have to make up for them with higher fertility to keep the population stable although they have no idea what the share of unmarried people is. And finally, it is also not possible for people to adjust their fertility to the size of the whole population because they don’t know what it is. Awkwardly, I also had the replacement level in my formula, which is not known and may change with mortality.

For all these reasons, it might seem impossible to achieve stabilization, but here is how to get around all this:

The first thing is to model the fertility behavior in a more sensible way. It should not be so that children are born randomly when their mother has some age. Instead she and her husband have a desired family size that they want to achieve. I will explain in a moment where that comes from. But let’s assume it as given for now.

What I do next is look at how many children a family already has and how old they are. I check when the last surviving child was born. If that is no more than two years ago, the family will not have another child this year. That is not realistic in our times. But in earlier times, breastfeeding was an important part of bringing up a child. And women cannot have another child while they do that. So in the past, the spacing between two births could be two years (also because of pregnancy, of course), sometimes even longer. Actually, this is also a very simple method for family planning. Longer time to weaning means fewer children.

Now, if the youngest surviving child is older than two years, the parents might have another child. However, that depends on their plan for the family. If they are still more than one child below their desired family size, they will have another child. In case a child dies, they could also have another one in this phase. It is somewhat trickier with the last child because the target size is not an integer. Hence I randomly decide whether the family has another child depending on their desired family size, and I do this in such a way that the expected value is the exact desired family size. That means the family might have one or no child now. Afterwards, they will not have another child and family planning is over.
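The decision rule for a single year can be sketched as follows. This is a simplified Python illustration with made-up names, not the code I actually ran; it assumes that the chance for the last child equals the remaining gap to the desired family size, which is one way to hit the desired size in expectation.

```python
import random


def has_birth_this_year(desired_size, surviving_children,
                        years_since_last_birth, planning_over, rng=random):
    """Return (birth, planning_over) for one family in one year.

    years_since_last_birth is None if there is no surviving child yet.
    """
    if planning_over:
        return False, True
    if years_since_last_birth is not None and years_since_last_birth <= 2:
        return False, False  # spacing: the youngest child is too young
    gap = desired_size - surviving_children
    if gap > 1:
        return True, False  # still more than one child short
    # Last decision: a birth with probability equal to the remaining gap,
    # so that the expected family size matches the desired size.
    birth = rng.random() < max(gap, 0.0)
    return birth, True
```

With a desired size of 2.4, a family would have two children for sure and a third one with a probability of 40%.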

Surely, you could add more realism here. For example, you might also build in a probability for unplanned births, or a probability for having twins or even triplets. That certainly plays a role. My maternal grandparents had one son, and were happy with that family size. But then my grandmother got pregnant when she was forty, which was not what she wanted. And on top of that, it was twins. I have a different take here, though: One of the twins is my mother, and I am very happy to have her.

Anyway, my maternal grandparents ended up with three children instead of the one they wanted. Actually, my paternal grandparents had one child. I have no idea whether they wanted it that way. If you think low fertility is something new: No, three of my grandparents were born in 1896, one in 1906. The demographic transition here in Germany was basically through in the 1920s.

— — —

There is still a major gap left to be filled: Where does the desired family size come from? Parents may know about this in some way, but how do their children learn it from them? Obviously, this is nothing you ever hear about.

On the aggregate level, I found a relationship that looks like this:

ln(actual fertility) = ln(replacement level) + ln(target size/actual size)/0.3.

Since the size of the population is not observable, I have to replace it with something that is observable. What I take is local population density around a family. To this end, I simply count how many people live closer to them than a certain distance (set to 0.1). That leads to the formula:

ln(actual fertility)
= ln(replacement level) + (ln(target density) - ln(actual density))/0.3

where I have also used that the logarithm of a quotient is the difference of the logarithms. We now have one quantity, actual density, but the others are still unknown. Especially, the target density and the replacement level are not directly accessible. But as will turn out below: You don’t have to know them at all to pursue a fertility that leads to the stabilization of the population!
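Counting the people within a fixed distance, as described above, might look like this in Python. It is only a sketch: the function name is invented, and whether the family itself is counted depends on whether its own position is in the list, which I leave open here.

```python
import math


def local_density(position, people_positions, radius=0.1):
    """Count people who live closer than `radius` to `position`."""
    return sum(1 for p in people_positions if math.dist(position, p) < radius)
```

Note that the logarithm is only defined for a count above zero, so an implementation has to handle an empty neighborhood in some way.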

Actual fertility is simply the outcome of what parents do when they have a family, at least as an expected value. You can get a grip on this for your parents by just looking at your family, namely at how many surviving siblings you have. What I propose now is that you look at that when you are 16 years old, ie. before you marry, and add one for yourself. Then you have to take the logarithm of that and use it as a proxy for the logarithm of actual fertility in the above formula. I take surviving children here because you might not know about siblings who died before you were born.
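As a tiny Python sketch (with a made-up function name), the proxy is just:

```python
import math


def ln_parent_fertility(surviving_siblings):
    """Proxy for ln(actual fertility) of the parents, taken at age 16:
    the surviving siblings plus one for yourself."""
    return math.log(surviving_siblings + 1)
```

An only child would end up with ln(1) = 0, someone with two surviving siblings with ln(3).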

But how about the target density? You could back that out if you knew about the replacement level and the actual density that your parents faced. However, that is impossible by definition because we are talking about a time when you were not even born. If you knew this quantity, you could calculate the target density and make it your own. But then as noted, you don’t even have to do that. The important point is the actual population density your parents were reacting to.

Here is how you can find out about that:

When you are eight years old, you estimate the population density around you at the time. Then you do the same when you are sixteen years old. In both cases you take the logarithm. I assume that in reality, this would work on a continual basis, and not just at two points in time. This should make the estimate more precise. However, it already works in this way.

The nice property of logarithms is that they turn exponential growth with a rate into a linear function with a slope. What you are interested in is the non-observable population density at age zero when you were born. If you now assume regular growth for the population, you can calculate the logarithm of that simply by subtracting the difference of the logarithms of your density estimates at age 16 and at age 8 from the logarithm of your density estimate at age 8. Instead of going forward with the difference from 8 to 16 years, you go backwards from 8 to 0 years.
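Here is the backward extrapolation as a short Python sketch. The function name is invented for this illustration, and it assumes steady exponential growth between age 0 and age 16, as in the text.

```python
import math


def ln_density_at_birth(density_at_8, density_at_16):
    """Extrapolate the log density back to age 0 from estimates at 8 and 16."""
    ln8 = math.log(density_at_8)
    ln16 = math.log(density_at_16)
    # Go backwards from age 8 by the same step that leads from 8 to 16.
    return ln8 - (ln16 - ln8)
```

For example, if you estimate 20 people around you at age 8 and 40 at age 16, the density doubled in eight years, so the extrapolated density at your birth is 10.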

If you now knew also the replacement level, you could first back out the target size of your parents, then plug it into the formula for yourself and calculate your desired family size. However, as already alluded to, it is much simpler than that.

You have two parallel equations for your parents and yourself of the form:

ln(actual fertility)
= ln(replacement level) + (ln(target density) - ln(actual density))/0.3.

If you assume that the replacement level is pretty much fixed as it should be for rather stable mortality, and your target density is the same as the one your parents pursued, you can just subtract the two equations from each other and cancel the awkward terms out:

ln(actual fertility_parents) - ln(actual fertility_children)
= (ln(actual density_children) - ln(actual density_parents))/0.3

You are not really interested in the target density or the replacement level anyway, only in what family size you should pursue to do the same thing as your parents. And now you have all the necessary data. All you want to know here is “actual fertility_children,” ie. your desired family size. We can sort for that, which leads to:

ln(actual fertility_children)
= ln(actual fertility_parents)
- (ln(actual density_children) - ln(actual density_parents))/0.3

This is a very simple formula, and to recap you get your data in this way: You calculate “ln(actual density_parents)” from two density estimates at age eight and sixteen. You estimate “ln(actual density_children)” yourself on the spot. And you obtain “ln(actual fertility_parents)” from just looking at how many surviving siblings you have at age 16. And then you know what your desired family size should be for a given population density around you. All you have to do is eventually exponentiate “ln(actual fertility_children).”
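Put into Python, the recap amounts to a few lines. This is a sketch with invented names, not my actual program; the parameter `scale` stands for the 0.3 in the formula.

```python
import math


def desired_family_size(ln_fertility_parents, ln_density_parents,
                        ln_density_now, scale=0.3):
    """Desired family size from the three observable inputs."""
    ln_fertility = ln_fertility_parents - (ln_density_now - ln_density_parents) / scale
    return math.exp(ln_fertility)
```

If the density around you is the same as the one your parents faced, you simply copy their family size. If it is higher, your desired family size is lower, and vice versa.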

However, that will be a real number and you can only have an integer number of children. That’s why I implement it as already alluded to above: If the family is still more than one child short, they just have another child. The critical question arises when they come within one child of the desired family size. I then randomly choose whether a family has another child or not, so that on average it works out as the desired family size. They cannot do this year in and year out because sooner or later they would have another child. That’s why I end family planning at this point.

But then there are two people involved. Who gets to decide? Both of them could apply their own formula. Since I think it is a good idea to pool information, I take the average of the estimates for “ln(actual fertility_parents)” and “ln(actual density_parents)” for the husband and wife. I guess that is not too unrealistic for how decisions about the number of children come about. But then it should also work if one side gets to decide, only with a loss of valuable information.
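The pooling step could be sketched like this in Python. The function name and the tuple layout are assumptions for this illustration only.

```python
def pooled_estimates(husband, wife):
    """Average the log-scale estimates of both spouses.

    Each argument is a tuple (ln_fertility_parents, ln_density_parents).
    """
    return tuple((h + w) / 2 for h, w in zip(husband, wife))
```

Since the averaging happens on the logarithms, it corresponds to a geometric mean of the underlying family sizes and densities.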

There is another potential problem here: A wife moves in with her husband. Now, she usually comes from a different location and has a different population density in mind. How to handle this? If the wife came from a thinly-populated region and she insisted on sticking with a low target for population density, the consequence would be that she would balk at having any children at all with a higher population density. If she moved to a more thinly populated region, she would instead want very many children.

My hunch is, though, that that’s not how it works. So I assume that the wife adjusts her estimate for “actual density_parents” to the new environment. Basically, what she does is estimate population density in the old location and in the new location when she moves. Then she takes the logarithm of the quotient, or the difference of the logarithms, which seems simpler to calculate. And finally she adjusts her old estimate for “actual density_parents” by that, i.e. adds the difference. What that means is that she treats it as if her parents had lived in the new location.
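As a Python sketch with an invented function name, the adjustment when the wife moves is one line:

```python
import math


def adjust_for_move(ln_density_parents, density_old_location, density_new_location):
    """Shift the wife's reference density by the log ratio of new to old density."""
    return (ln_density_parents
            + math.log(density_new_location)
            - math.log(density_old_location))
```

If the new location is twice as dense as the old one, her reference density doubles as well, so the move by itself does not change her desired family size.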

You could add more realism here. I think people not only look at their core family, but also at the families of their relatives, of their peers, and just of general people around them. While my hunch is that the information from your family plays the main role, I would think that there is also a lot of herd behavior here, which would be good because it makes estimates more stable if you average out over more information. Yet, even with this simple setup, I can get very good results. So any such flourishes should only make them better.

And if you think about a species with fewer capabilities than humans, maybe such information is just not available. However, what seems eminently doable is to get your hands on density estimates when you grow up and now. I have written about how many species can do this or might do it here. And to keep track of how many surviving siblings you have while you grow up appears also easily within reach.

— — —

Up to this point, this is all totally theoretical, and it may seem as if it could never work. But it does, even so well that I was surprised. Let me present the results now.

There is a technical problem initially to get a population going. I also had this in the simpler model that I used before. I begin with 500 people who are all born in the same year. That’s a pretty small population, which must introduce massive errors into all estimations. Since everybody is born at the same time, there is a massive momentum effect initially, ie. very fast population growth.

With this setup, that should mean that people perceive very fast growth and deduce from it that their parents were pursuing a pretty low population target. Hence they would react with very low fertility and that would then lead to an implosion. It should take some time for the population to get to some kind of a steady state from there. But this is more or less an artifact you can ignore.

At its low, the population goes down to as little as 50 people in one simulation I did. That often means there are no children in a year at all. Still, over the longer run, and I simulate this for 10,000 years, things get very stable. Here is the population size over time:

The reaction is very drastic at the start. Then very slow population growth sets in. From the year 1000 to the year 2000, the annual growth rate is less than 0.1% on average. I am unsure how to explain that, but assume that this is so because the system is far from equilibrium. Presumably the search for spouses might also play a role, at least in the initial phase. With very few people, it is hard to find someone, and then it gets easier with population growth, which could then get baked into the target size. The random placement of people at the start also leads to an underlying graph, and my hunch is that that could play a major role here.

This is a loose end: I have to think all this through, so take this only as a preliminary explanation. I also have to admit that I don’t understand whether the rule for finding spouses plays a role in stabilizing the population at some level. A simulation with an initial size of 1000 appears to settle down at a relatively higher level around the starting level. That would contradict the idea that the search for spouses plays a role for stabilization. But then this may also depend on random effects in the initial phase or the underlying graph. The eventual size can vary a lot. Sometimes the population also dies out during the initial phase. I don’t know how all this works at the moment, but will show you a few examples of what can happen.

Anyway, after the year 2000, which seems to be when the population reaches some kind of a steady state, hardly anything changes anymore. The absolute population size in 2000 is 166, in 5000 it is 199 and 192 in 10000. Population growth from 2000 to 5000 is minimal, with 0.0058761% annually. And from 5000 to 10000, there is almost imperceptible shrinkage of 0.0006154% annually. The population remains stable for thousands of years around a size of about 200 people.

I would expect a slight drift here, though I did not expect such a slight one. Errors should induce a random walk for target sizes. Since there is no memory beyond the last generation, that would have to be persistent. That makes it even more remarkable how little change there is. Any rule that would lead even to marginal growth or shrinkage would mean the size of the population will explode or implode over thousands of years. Effectively, the rate of growth is zero here.

For comparison, here is what happens with an initial population of 1000. It takes longer to reach a steady level, but that is then pretty stable for millennia:

However, you can also get this longer buildup with a size of 500. The above example is just one realization, and there is a lot of variation in the eventual level as well as in how long it takes to reach it. At worst, the increase over the first two millennia in that example is not representative. As noted, I will discuss further examples below.

— — —

Let’s now look at some structural data for the population. The first is for realized fertility. I plot it for cohorts by year of birth, so to fit it to the underlying fertility you might have to move the time series a little to the left, namely by a generation length. But then this would be hardly visible at this scale.

As you can see, there is a lot of noise in the process. Since the population is so small, there are cohorts with no people and hence also no realized fertility. Mostly this is only about very few couples that have children in a year, often only a single family. That’s visible in the discretization to integer sizes, mostly from zero to four, but sometimes also beyond. Actually, a maximum of eight comes close to what applies for actual populations. Despite the extreme noisiness, the population hits the replacement level with extreme precision on average over millennia. That is even a level that the population has no explicit knowledge about.

Here is another view on the population: How many of those at least 18 years old are married in a year, including those who are widowed:

The blue line is the percentage for women, the green line for men. As I took parameters where it is rather easy to find spouses, the percentage is very high for those who are married. Yet, as you can see men tend to be more often unmarried. For those alive, the number of married men and women must be the same, though. So the difference results from those who are widowed: the red line shows their share for women, and the light blue line for men. Since men marry somewhat later and tend to have younger wives, they more often leave widows behind. And for women it is the other way around. Note that I assume equal mortality in my simulation, so this is not about longer life expectancy for women than for men.

Again, for comparison also the results for an initial population of 1000. First realized fertility, which shows less volatility and clusters around the replacement level:

Next the results for the share of those married including those who are widowed: women in blue, men in green as well as the share of widows in red, and widowers in light blue.

Results are smoother with more people, but basically follow the same pattern. As it seems, even after millennia there are some very slow developments going on.

— — —

Let me show you now a few additional simulations, again with 500 people at the start. As noted above, a population can easily die out from the severe shock at the start. But what is curious is that once it has made it through the onslaught, it persists for millennia even at very low population sizes. Here are some examples:

The development is similar to the case above. There is a small dip in the middle, but that gets corrected later. Despite a slight trend upwards, there is hardly any growth. If it seems as if this were simple: No, just 0.1% growth a year would mean that a population grows by a factor of 148 over 5000 years.
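The compounding is quick to check. Here is a two-line Python sketch with an invented function name:

```python
def growth_factor(annual_rate, years):
    """Factor by which a population grows at a constant annual rate."""
    return (1 + annual_rate) ** years


print(round(growth_factor(0.001, 5000)))  # 148
```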

Here is another realization with a minimal drift upwards later on:

Note that the populations go down to just about fifty people over the first few centuries. In spite of this very small population size, they do not die out, which would be easy with some random drift. But then I only show examples where the population does not die out, so there is a certain survivorship bias here.

Next comes an example where the implosion is even worse:

At the low, the population persists at a size of sometimes less than twenty people. Then there is a buildup, but over the long run the story is stabilization.

Now, an example of a population that remarkably remains at a very low population size for millennia:

At the low, the population shrinks to perhaps a dozen people. But that does not do it in. Instead it makes a comeback, and then levels off around a population size of just one hundred.

And here is an example where the population size remains even lower for millennia and rarely gets past just 100 people:

And finally another realization with 1000 people at the start. With that size, the population pretty regularly survives the initial shock. So far I have not had a run where it went extinct. But then I have fewer of them because the simulation is slower:

Here the stable level is not around 1000, but 400. But then it is also very stable with only a marginal drift upwards over millennia.

— — —

Now for a round of different analyses: One way to study an unknown system is to send in shocks and see what happens. Fortunately, these are not real people, so I can experiment on them. The first thing is to suddenly ramp all the reference population densities up that were inferred for the parent generation. I do this after 5000 years and by a factor of 1.5. My reasoning here is that after so much time, the population is perhaps close to an equilibrium:

As you can see, the population reacts as expected. There are also the oscillations around the higher level that I had in my simpler model. That goes on for centuries. However, what I find curious is that over the longer run, the population size reverts to a lower level and stabilizes there. I have to admit that I do not understand how this works. Note also that the reversion may be faster, even much faster in other simulations.
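In terms of the logarithms that people carry around, a shock by some factor simply adds the logarithm of that factor to every reference density. Here is a Python sketch of that step; the names are made up, and the second function only restates what the formula for the desired family size implies on impact for an unchanged current density, not what the simulated population ends up doing.

```python
import math


def apply_density_shock(ln_reference_densities, factor):
    """Scale all reference densities by `factor`, working on their logarithms."""
    shift = math.log(factor)
    return [value + shift for value in ln_reference_densities]


def impact_on_desired_size(factor, scale=0.3):
    """Multiplier for the desired family size right after the shock,
    as implied by the formula with its 0.3 in the denominator."""
    return factor ** (1 / scale)
```

For a factor of 1.5, the multiplier comes out at almost four, which would explain a sharp peak in fertility. Spacing between births and the time it takes to marry limit how much of this can be realized, though.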

Here is what the development looks like for realized fertility:

There is a predictable peak for fertility, but the rest is hard to spot. I do not show the graph for the share of married and widowed people. Despite the massive shock, it looks pretty much the same as without it. I would conclude from this that these assumptions do not matter for what happens.

Next I do the same thing for a negative shock by a factor of 0.5. Here is the development for the population size:

I am a bit at a loss why the decrease is so small. But then there seems to be a fast counter-reaction with high fertility that may explain it. Again, there are some oscillations for a few centuries, but the population size then reverts to a level somewhat lower, but maybe in line with the level before the shock. In other examples the population reverted even faster, so the effect was hard to spot.

Here is the situation for realized fertility where you can mostly only see the higher fertility that leads to the bounce back:

As in the previous case, the chart for the share of married and widowed people is unremarkable. It is impossible to see any effect at all, at best a very minimal one.

I have also run a parallel simulation with 1000 people at the start. Here is what I get for the positive shock:

The result is similar, though much smoother with a larger population size. The retreat from the higher level seems to be slower and the population remains at an elevated plateau for longer. However, that may only be accidental for this one simulation. The outcomes for the other analyses are as in the case for smaller populations.

Let me zoom in on the time period from one hundred years before to nine hundred years after the shock instead:

It is also easier to track what happens with realized fertility on such a smaller time scale:

You can see the oscillations in realized fertility here after the shock. They then fade into noisy behavior after about half a millennium. So much for now; I have to look deeper into this.

And finally, because that’s only appropriate, I would like to say a big thank you to the people who have developed Eclipse, the development environment I use to do the programming in Java, and the people who have developed Scilab that I use for the visualizations. Those are really great programs, and I very much appreciate that they are freely available. This is incredible. Also how fast Java is and how much my notebook can handle: Hundreds of thousands of objects with ease, just amazing! I could do all this stuff in a single day.

— — —

Conclusion

To sum up: I have a model that qualitatively captures a lot of features of actual populations. I assume a very simple rule for stabilizing the population that people can apply only with easily accessible knowledge. There is no need to know the replacement level, no need either that parents tell their children about their target for population density or even the size of the whole population, and all calculations are very simple once you grant that logarithms and exponentiation as well as a random choice for the last child are available.

Despite a very small population size of only a few hundred people and even below one hundred, and lots and lots of noise in the estimation, the population manages to stabilize at a target size and almost perfectly pins the unknown replacement level down. There is some random drift for target sizes as there should be. Yet, the effect is population growth or shrinkage that is almost zero.

Note also that the population does this without any further input. Population density has no positive or negative effects here that would push the population upwards or downwards. It could grow or shrink to an arbitrary size without any constraints. With so much time, even slight growth or shrinkage would lead to huge population sizes or extinction. Neither happens over many runs of the model. The only exception is the severe shock at the start that can bring a small population to its knees.

I am still baffled by how well this works. Given the long derivation, I expected a priori that stabilization would be much harder, especially for a small population, and much more unstable than it turns out to be. I was expecting random drift over very long time periods. But it does not look like there is much there. Instead the population remains stable for millennia. Perhaps the eventual sizes are a long-run effect of random developments at the start, although I find that unlikely. Or the apparent attractor for the system has something to do with the underlying graph for where people live. I have not yet looked into this, but that would be my first guess where to look for an explanation. Or it is something else that I miss at the moment.

The whole process is very simple, so simple that not only humans, but many other species could pursue it. All it takes is that you are able to estimate densities over time and figure out how many surviving siblings you have, which seems possible even for pretty dumb species. And of course, there is also no reason to assume any conscious calculations here. You could embed them “in hardware.”

Not only would stabilization be valuable for a population, what I find even more remarkable is how extremely small populations can stick around for millennia without showing the least inclination to go extinct because of some random drift. If I were to design a species, I would find that very attractive because it means the species will be around for a very long time.

I hope my model settles a certain warranted skepticism that a population just cannot stabilize because this would demand too much information and also complicated math that is not available. Actually, if you aggregated the behavior of such a population, you would get the same formula that I found for South Korea and that also worked so well for Japan.

This is, of course, no proof, but at least I have a pretty good candidate here how it could work. I view this as a “proof of concept.” It can work. Very probably an actual mechanism would be somewhat different, and it would also have to include inputs that make a population grow or shrink at times plus some feedback from conditions to steer it to a reasonable level. But then I think this is a good start. I hope it is clear that I make no claims the results can serve as very long run predictions. Too much is still open here.

— — —

PS

In the original version of this post, I also reported data for average age at marriage. However, I had an error in my program. What I calculated was not the average age at marriage for those alive at some point in time, but for all those that had ever lived up until this time. That also included people who were long dead. And since the respective population always grows, things got smoother and smoother over time. While the result was correct, it was not what I thought it was. I have deleted the graphs, which were not essential and also comments on them. I will work more with the model and then also supply the data I wanted to have here.

PPS

See my next post for this.
