Life in Death: Understanding Mortality Data
An introduction to mortality tables and their applications
At the heart of data science is the desire to analyze data for actionable insights. Death statistics are one more type of data worth analyzing.
A mortality table, less morbidly known as a life table, shows the death rates of a defined population at every age. Each rate is the probability that a person of a given age dies within a year, and is often quoted as the expected number of deaths per thousand people. Mortality tables are often split by demographic characteristics such as gender, race, occupation or socio-economic status to account for biological and external factors.
The notation qₓ denotes the probability that a person aged x dies within the next year. Conversely, the probability that a person aged x survives the next year is pₓ = 1 - qₓ.
Another important notation is ₙqₓ, the probability that a person aged x will die within the next n years. Its counterpart ₙpₓ is the probability that a person aged x will survive the next n years.
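The multi-year probabilities follow from the one-year ones: surviving n years means surviving each of the n years in turn, so ₙpₓ is the product of pₓ, pₓ₊₁, ..., pₓ₊ₙ₋₁, and ₙqₓ = 1 - ₙpₓ. A minimal sketch in R, where the death probabilities are made-up illustrative numbers (not taken from any table) and the function names are hypothetical:

```r
# Hypothetical one-year death probabilities q_x for ages 60 to 64.
# These values are invented for illustration only.
qx <- c(`60` = 0.010, `61` = 0.011, `62` = 0.012, `63` = 0.013, `64` = 0.014)

# One-year survival probabilities: p_x = 1 - q_x (names are preserved)
px <- 1 - qx

# n-year survival probability for a person aged x: the product of the
# one-year survival probabilities for ages x, x + 1, ..., x + n - 1.
npx <- function(px, x, n) {
  ages <- as.character(seq(x, length.out = n))
  stopifnot(all(ages %in% names(px)))  # every required age must be in the table
  prod(px[ages])
}

# n-year death probability is the complement of n-year survival.
nqx <- function(px, x, n) 1 - npx(px, x, n)

npx(px, x = 60, n = 3)  # equals 0.990 * 0.989 * 0.988
nqx(px, x = 60, n = 3)
```

With n = 1 the functions reduce to pₓ and qₓ, which is a quick way to check them against the table itself.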
The data used for this article were taken from the latest general mortality tables in the Human Mortality Database, which provides freely available datasets on mortality in developed countries.
Exploring Mortality Tables
We can plot the data using R to observe any general trends. We use the logarithmic transformation of the mortality rates to reduce the skewness of the distribution.
By examining the mortality rates of 21-year-olds over the years, we notice a declining trend. These reductions are likely attributable to advances in medicine and healthcare technology, notably the introduction of sulfa drugs in the mid-1930s and the widespread availability of penicillin in the 1940s. The gap in the plot between 1914 and 1918 is due to the lack of mortality statistics during World War I.
If we plot mortality rates by age, we observe that after noticeably high infant mortality rates, the likelihood of death steadily increases as people get older, as we might expect.
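The two plots described in this section could be produced along the following lines. This is only a sketch: it assumes the life table is already loaded into a data frame with columns named Year, Age and qx (hypothetical names; adjust them to match the file you downloaded), and the function names are invented for illustration.

```r
# Sketch only. `lt` is assumed to be a data frame with columns
# Year, Age and qx (assumed names, not confirmed by the source data).

# Log mortality rate at a single age across calendar years.
plot_log_qx_over_time <- function(lt, age) {
  rows <- lt[which(lt$Age == age), ]
  rows <- rows[order(rows$Year), ]
  plot(rows$Year, log(rows$qx), type = "l",
       xlab = "Year", ylab = "log(qx)",
       main = paste("Log mortality rate at age", age))
  invisible(rows)  # return the plotted rows for inspection
}

# Log mortality rate across ages within a single calendar year.
plot_log_qx_by_age <- function(lt, year) {
  rows <- lt[which(lt$Year == year), ]
  rows <- rows[order(rows$Age), ]
  plot(rows$Age, log(rows$qx), type = "l",
       xlab = "Age", ylab = "log(qx)",
       main = paste("Log mortality rates by age in", year))
  invisible(rows)
}
```

Calling plot_log_qx_over_time(lt, age = 21) would draw the first plot, and plot_log_qx_by_age(lt, year) with a chosen year would draw the second. Rows whose qx is missing appear as breaks in the line, which is one way a gap in the statistics can show up.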
The Concept of Present Value
Analyzing mortality data can offer numerous insights into population health and human longevity, but its most common application is in valuing and pricing life insurance.
Before we delve into the common types of insurance products, we must first understand the concept of present value. Present value, as the name implies, is what money to be received or paid in the future is worth today. It is found by “backtracking”, or discounting, every future cash flow to the present at the interest rate.
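The general equation for the present value is given by
$$PV = \sum_{t=1}^{n} \frac{C_t}{(1 + i)^t}$$
where PV is the present value, C_t is the cash flow at the end of year t, n is the number of years, and i is the annual interest rate.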
It is standard notation to define the annual discount factor v as
$$v = \frac{1}{1 + i}$$
so that a payment of C due in t years has a present value of C·vᵗ.
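Taking v = 1 / (1 + i) as the annual discount factor, a present value calculation can be sketched in R as follows. The function name and the cash flows are hypothetical, and payments are assumed to fall at the end of each year.

```r
# Present value of a stream of end-of-year cash flows.
# cashflows: amounts paid at the end of years 1, 2, ..., n
# i:         annual interest rate (e.g. 0.05 for 5%)
present_value <- function(cashflows, i) {
  v <- 1 / (1 + i)           # annual discount factor
  t <- seq_along(cashflows)  # payment times: 1, 2, ..., n years from now
  sum(cashflows * v^t)       # discount each payment, then add them up
}

# Hypothetical example: 100 at the end of each of the next 3 years at 5%
present_value(c(100, 100, 100), i = 0.05)  # approximately 272.32
```

The result is below the 300 paid in total because each payment is worth less today the further away it is.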
Applications of Mortality Data in Insurance Valuation
Now, we can use our knowledge of mortality tables and present value to find the expected present value (EPV) of the following life insurance products. The formulas may seem intimidating at first glance, but their general form is the same.
For a person aged x, the EPV is a sum over the future years k. Each term multiplies three factors: the discount factor for a benefit paid in that year, the probability of surviving to age x + k, and the probability of dying between ages x + k and x + k + 1.
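As a sketch in symbols, and assuming a benefit of 1 paid at the end of the year of death (the payment timing is an assumption, chosen because it is the usual convention for annual tables), this general form can be written as
$$\mathrm{EPV} = \sum_{k} v^{k+1} \, {}_{k}p_{x} \, q_{x+k}$$
where v = 1 / (1 + i) is the annual discount factor, ₖpₓ is the probability of surviving k years from age x, and qₓ₊ₖ is the probability of dying in the year that follows. Only the range of k changes from one product to the next.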
‣ Whole life insurance: insures a person for their entire life, provided they keep paying a fixed premium at regular intervals.
‣ Term life insurance: insures a person only for a specific length of time. If the insured person passes away during this period, a death benefit is paid to their beneficiaries. Otherwise, the policy expires with no benefit paid unless the person opts to renew it.
‣ Deferred life insurance: no benefit, or only a limited one, is paid if the insured person dies during the first u years. After that, the policy acts as whole life insurance.
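Under the usual discrete model, the three products differ only in which policy years are covered: whole life covers every year k ≥ 0, an n-year term policy covers k = 0, ..., n - 1, and a u-year deferred policy covers k ≥ u. The sketch below computes all three in R under stated assumptions: a benefit of 1 paid at the end of the year of death, no benefit at all during the deferral period, and a toy vector of death probabilities invented for illustration. The function names are hypothetical.

```r
# EPV of a benefit of 1 paid at the end of the year of death, counting only
# deaths in policy years from_year + 1 through to_year.
# qx_from_x: one-year death probabilities q_x, q_{x+1}, ... starting at the
#            insured's current age x (the last entry should be 1 so that the
#            table covers the whole remaining lifetime)
# i:         annual interest rate
epv_death_benefit <- function(qx_from_x, i, from_year = 0,
                              to_year = length(qx_from_x)) {
  stopifnot(from_year >= 0, to_year <= length(qx_from_x), from_year < to_year)
  v <- 1 / (1 + i)
  k <- 0:(length(qx_from_x) - 1)
  # kpx[k + 1] is the probability of surviving k years from age x (0px = 1)
  kpx <- c(1, cumprod(1 - qx_from_x))[k + 1]
  terms <- v^(k + 1) * kpx * qx_from_x  # one term per policy year
  sum(terms[(from_year + 1):to_year])
}

whole_life    <- function(q, i)    epv_death_benefit(q, i)
term_life     <- function(q, i, n) epv_death_benefit(q, i, to_year = n)
deferred_life <- function(q, i, u) epv_death_benefit(q, i, from_year = u)

# Toy death probabilities from the current age onward (made up, not real data)
q <- c(0.1, 0.2, 0.4, 1)

whole_life(q, i = 0.05)
term_life(q, i = 0.05, n = 2)
deferred_life(q, i = 0.05, u = 2)
```

A useful sanity check is that a 2-year term policy plus a 2-year deferred policy together pay out in exactly the same cases as whole life, so their EPVs should add up to the whole life EPV.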
We have barely scratched the surface of the characteristics and uses of mortality data. For further study, you might want to read up on the lifecontingencies package in R.