American Lifespans Suddenly Dropped Between 1800 and 1810 But Nobody Knows Why  

The first study of data from online genealogy websites reveals remarkable changes in average lifespan since 1650 that leave demographers scratching their heads

Genealogy—the study of family history—has undergone a revolution in recent years thanks to the web. Numerous websites have sprung up all over the world that catalogue the familial links between millions of people dating back hundreds of years.

That’s an extraordinarily rich dataset. Because it records each person’s birth and death date, it holds information about lifespans through the ages and how these correlate across families from generation to generation.

Strangely, these datasets have received scant attention from the research community. Until now.

Today, Michael Fire and Yuval Elovici at Ben-Gurion University in Israel say they’ve mined the data on one of these genealogy sites, WikiTree.com, to reveal fascinating insights into the nature of lifespan and how it varies with factors such as the lifespans of other family members.

But they’ve also uncovered a puzzle. The data reveals short-lived variations in lifespan that affected men and women differently. Exactly what caused these variations, nobody knows.

WikiTree is a free, collaborative, family-tree website containing profiles of over 360,000 individuals who were born in the US since 1650. The profiles give the gender, marital status, dates of birth and death as well as links to profiles of other family members, such as spouse, children, siblings, parents and so on.

Fire and Elovici used this to create a standard social network of the familial links between people, revealing the structure of nuclear families and extended ones. They then grouped individuals according to their year of birth. This allowed them to calculate the average lifespan for men and women for each year of birth since 1650.
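The grouping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records in PROFILES are invented, the tuple layout (sex, birth year, death year) is hypothetical, and the function is not taken from the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (sex, birth_year, death_year). Invented for
# illustration; these are not real WikiTree profiles.
PROFILES = [
    ("M", 1800, 1866),
    ("M", 1800, 1870),
    ("F", 1800, 1864),
    ("F", 1810, 1871),
    ("M", 1810, 1873),
    ("F", 1810, 1873),
]


def average_lifespan_by_cohort(profiles):
    """Return {(birth_year, sex): mean lifespan in years}."""
    lifespans = defaultdict(list)
    for sex, born, died in profiles:
        # Lifespan is approximated as the difference between the two years.
        lifespans[(born, sex)].append(died - born)
    return {cohort: mean(values) for cohort, values in lifespans.items()}


COHORT_AVERAGES = average_lifespan_by_cohort(PROFILES)
```

Keying the result on the pair (birth year, sex) keeps the male and female series separate, which is what makes it possible to compare them year by year.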

That throws up some interesting results. It shows that if an individual survived beyond the age of ten, he or she would probably live beyond 60, even in the seventeenth century. For example, among people born in 1650 who survived beyond the age of ten, the median age at death was 61.89 for men and 63.04 for women.
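To see why restricting the sample to people who survived childhood matters, here is a minimal sketch of a conditional median. The ages in SAMPLE_LIFESPANS are invented, and the function name and the min_age parameter are assumptions made for this example rather than anything defined in the paper.

```python
from statistics import median


def median_lifespan_after(lifespans, min_age=10):
    """Median age at death among people who lived past min_age.

    Returns None when nobody in the sample survived that long.
    """
    survivors = [age for age in lifespans if age > min_age]
    if not survivors:
        return None
    return median(survivors)


# Invented ages at death for one birth cohort, including two childhood deaths.
SAMPLE_LIFESPANS = [1, 3, 58, 62, 70]

MEDIAN_ALL = median(SAMPLE_LIFESPANS)
MEDIAN_SURVIVORS = median_lifespan_after(SAMPLE_LIFESPANS)
```

In this toy sample the two childhood deaths pull the overall median down, while the median among survivors reflects only those who reached adulthood.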

The death rates below the age of ten show some inconsistencies with other data sources, however. In 1900, this death rate is known to be about 15 per cent but the rate from WikiTree is substantially lower.

Fire and Elovici speculate that this is probably because there was no formal definition of a “live birth” in those days. Many births that would be classified as live today would have been recorded as stillbirths at the time, and so do not appear in the WikiTree data.

Their data mining reveals some interesting correlations. They say there is a small but significant correlation between children’s lifespans and those of their parents. However, this disappears after a generation.

That may be the result of some genetic dependence for lifespan but equally social factors may play a role too. Being poor might condemn parents and children alike to an early death, for instance. And there’s no way to tease factors apart in this analysis.

Fire and Elovici also found a small but significant correlation between the lifespans of spouses. If your spouse is long-lived, so too are you likely to be.
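A correlation of this kind is commonly measured with the Pearson coefficient, although the article does not say which statistic the authors used. The sketch below computes it for paired lifespans; the two lists are invented, and their correlation is much stronger than the small effect reported from the real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(spread_x * spread_y)


# Invented lifespans of four married couples, listed in the same order.
HUSBAND_LIFESPANS = [60, 65, 70, 75]
WIFE_LIFESPANS = [62, 64, 73, 71]

SPOUSE_CORRELATION = pearson(HUSBAND_LIFESPANS, WIFE_LIFESPANS)
```

A value near +1 means long-lived people tend to have long-lived partners, a value near 0 means no linear relationship, and a negative value would correspond to the kind of inverse relationship reported between number of children and a mother's lifespan.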

The data also throws up negative correlations. For example, they found significant evidence that women who have more children die younger.

Perhaps most interesting are the mysterious variations in lifespan that the WikiTree data throws up. There are several years when the lifespans of both men and women decline sharply. Men and women born in 1800, for example, have an average lifespan of 66.21 and 64.66 years respectively.

But for those born in 1810 this had dropped by about 2.5 years for men, to 63.68, and by about 3.1 years for women, to 61.55. Why this happened is a mystery.
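The size of the drop follows directly from the figures quoted above. A short check, using only those four numbers (the dictionary layout and function name are choices made for this example):

```python
# Average lifespans in years by birth year, as quoted in the article.
AVERAGE_LIFESPAN = {
    "men": {1800: 66.21, 1810: 63.68},
    "women": {1800: 64.66, 1810: 61.55},
}


def lifespan_change(series, start_year, end_year):
    """Change in average lifespan between two birth years, in years."""
    return round(series[end_year] - series[start_year], 2)


# A negative value means the later cohort lived less long on average.
DROPS = {
    sex: lifespan_change(series, 1800, 1810)
    for sex, series in AVERAGE_LIFESPAN.items()
}
```

Both values in DROPS come out negative, and the decline is somewhat larger for women than for men.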

What’s more, at other times the data shows male lifespans increasing while female lifespans decreased, and vice versa. Fire and Elovici give these examples: “From 1650 to 1660 the male median lifespan increased from 61.86, to 66.81 while in the same period of time the female median lifespan decreased from 63.04 to 60.8. Similar patterns reoccur between 1770 and 1780, only this time the female average lifespan increased from 66.31 to 68.63, while the male average lifespan decreased from 66.55 to 64.29.”

But again, the question of what caused these sudden changes in lifespan leaves Fire and Elovici scratching their heads. “We hope to discover underlying reasons for these patterns in our future research,” they say. Readers are invited to post their own ideas here.

Despite this puzzle, and also because of it, this research has huge potential for follow-up work. Fire and Elovici have looked at only one online resource. So an obvious possibility is to compare this data with data from other genealogy websites and to use the same methods on family history data from other parts of the world.

Perhaps more interesting is the possibility of analysing not just the numerical data on the WikiTree site but also the notes and comments left by relatives. This could provide important insights into life and death throughout history.

This is clearly the beginning of a fascinating new form of research. As Fire and Elovici put it: “We believe that this study will be the first of many studies which utilize the wealth of data on human populations, existing in online genealogy datasets.”

Ref: arxiv.org/abs/1311.4276v1: Data Mining of Online Genealogy Datasets for Revealing Lifespan Patterns in Human Population

Follow the Physics arXiv Blog here