How to price any health insurance product

Stéphane Soulier
Alan Product and Technical Blog
6 min read · Sep 15, 2020

At Alan, we strive to unlock frictionless, fair and friendly healthcare for everyone, to become the health partner for companies and their employees.

Among these companies, large ones often have very specific coverage needs that a standardized offer cannot match.

By building “tailored” coverage tables, we are able to propose any guarantee based on the company’s needs. That comes with a big challenge: how do we automatically price a custom-made plan?

What is a coverage table?

It describes the insurance product. A coverage table typically contains 40 to 50 distinct guarantees for different types of care “acts”. In France, it needs to comply with the rules published by the National Social Security. All these rules create a lot of subtleties (maximum reimbursement per act, maximum number of care acts per year, …). Modeling them precisely is challenging and requires a per-guarantee approach.

What factors impact pricing?

The price of a health insurance product depends on two factors: the level of guarantees in the product itself, and the demographic distribution of the population to whom the product is offered.

First, pricing is impacted by the level of the guarantees expressed in the coverage table constituting the product:

  1. Members with a premium guarantee level will consult more expensive doctors, which increases the reimbursement cost per care act.
  2. Members with a premium guarantee level will consult more often because they are well reimbursed. The number of care acts per year (= frequency) will be higher.

Second, pricing is impacted by the demographics of the population to whom the product is offered, because the health consumption of each covered employee and their dependents (partner, children) evolves through their stages of life. So, when we cover a company we need to look at the demographic characteristics of its employee population, in particular their age, the male / female proportion, etc. Those characteristics can vary a lot from one company to the next and per line of business. You can observe a good illustration of the impact of the average age in the chart below, computed from our current portfolio data.

Unfortunately, we have access to very little data on a company's demographic structure until it actually becomes our customer, so at the time of pricing we can only rely on a few aggregated metrics (average age, gender percentage).

The traditional modeling approach

Computing a price boils down to predicting the overall reimbursement cost for all the employees of a given company and their dependents. The standard market approach for this task is to split the problem into two prediction sub-problems.

For each guarantee, the first model estimates the frequency of care acts per insured member, and the second model predicts the average cost of a care act. Most of the time, an actuarial team would use generalized linear models and feed in all the key price drivers as features (guarantee type, age, gender, moral hazard, etc.).
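Put together, the two models give an expected cost per member: for each guarantee, frequency times average cost, summed over guarantees. A minimal sketch of that combination step, where the guarantee names and numbers are made up and the two dictionaries stand in for the outputs of the frequency and cost models:

```python
# Illustrative only: guarantee names and values are made up, and the two
# dictionaries stand in for the predictions of the two models.

def expected_annual_cost(frequency_by_guarantee, avg_cost_by_guarantee):
    """Sum over guarantees of (acts per member per year) x (average reimbursement per act)."""
    return sum(
        frequency_by_guarantee[g] * avg_cost_by_guarantee[g]
        for g in frequency_by_guarantee
    )

frequency = {"general_practitioner": 4.0, "dental_care": 1.5}   # acts per member per year
avg_cost = {"general_practitioner": 10.0, "dental_care": 40.0}  # euros reimbursed per act

cost_per_member = expected_annual_cost(frequency, avg_cost)
```

With 40 to 50 guarantees in a coverage table, this means maintaining on the order of a hundred fitted models.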

This approach works well but it has a few drawbacks. Overall, it is complex (two models per guarantee type) and requires a lot of data to reach good accuracy. It also cannot price guarantees whose rules are changing, because there is no claims history to fit the models on (which is happening in 2020 due to a legislation shift).

Finally, it can only model linear effects, where the price grows proportionally with the underlying factors, which is not always what happens in real life. Consider below how the cost of orthodontic care for children relates to their age: it does not follow the “straight line” that a linear model assumes.

Using a linear model for a non-linear phenomenon is a classic case of model misspecification. The likely result of using such a model to price our products would be to severely overestimate the cost for children aged 15 to 20. This last problem can be fixed ad hoc, but that requires manual intervention and increases overall complexity.

At Alan, one of our leadership principles is “Alaners simplify”. We take a step back and start from first principles to solve our problems. We try to understand and learn from others, and we re-use existing approaches only when they are the best fit for us. Here, the traditional approach proved to be too complex, among other significant drawbacks. We felt that we were in a unique position to harness our infrastructure to build something simpler and more powerful.

Core pieces of Alan’s pricing engine

As described previously, there are a lot of subtleties behind a coverage table. But because we operate our insurance business every day, we already have “on the shelf” code that computes the reimbursement of any care act, for any coverage table and any member: the claim engine. We are a tech-first company, and our modular approach has allowed us to re-use that asset to build our pricing tool.

We are able to take any combination of existing member profiles from our portfolio and to run them through our claim engine to compute the amount we would reimburse if they were under the coverage table that we want to price. If the combination of member profiles is close enough to our prospects’ employee population and their dependents, we can compute the price of a product for a company. The underlying hypothesis is that a population with the same characteristics will have the same health consumption. This approach is at the heart of our pricing strategy.
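To make the idea concrete, here is a toy sketch of replaying a member's existing claims under a different coverage table. It is not the actual claim engine: the `Guarantee` fields, the act names and the amounts are all assumptions, and only two of the rules mentioned earlier are modeled (a maximum reimbursement per act and a maximum number of acts per year).

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Guarantee:
    """Hypothetical, heavily simplified guarantee for one type of care act."""
    rate: float         # share of the act's cost that is reimbursed
    max_per_act: float  # cap on the reimbursement of a single act, in euros
    max_acts: int       # number of acts reimbursed per year

def reimburse(claims, coverage_table):
    """Total reimbursement for one member's yearly claims under a coverage table.

    `claims` is a list of (act_type, cost) tuples in chronological order.
    """
    acts_seen = Counter()
    total = 0.0
    for act_type, cost in claims:
        guarantee = coverage_table.get(act_type)
        if guarantee is None:
            continue  # act not covered by this table
        if acts_seen[act_type] >= guarantee.max_acts:
            continue  # yearly limit already reached
        acts_seen[act_type] += 1
        total += min(cost * guarantee.rate, guarantee.max_per_act)
    return total

# Made-up coverage table and claims history.
table = {"osteopathy": Guarantee(rate=1.0, max_per_act=50.0, max_acts=2)}
claims = [("osteopathy", 60.0), ("osteopathy", 40.0),
          ("osteopathy", 45.0), ("massage", 30.0)]
total = reimburse(claims, table)
```

Here the first session is capped at 50€, the second is reimbursed in full, the third exceeds the yearly limit and the massage is not covered, so `total` is 90.0. Pricing a new table then amounts to swapping `table` and replaying the same claims.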

In order to achieve this, we built the smart-sampling algorithm. It extracts from our portfolio a set of member profiles that collectively share the same demographic characteristics as the employee population of the targeted company:

1/ We create a theoretical age distribution based on the average age of our prospect’s employees. Here we only make two hypotheses: the distribution is log-normal, and its standard deviation depends linearly on the average age.

2/ We pick random member profiles from our portfolio that together fit this age distribution, so we can use their corresponding claim profiles. We proceed the same way for male and female members.
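The two steps above can be sketched as follows. This is one possible reading, not the production algorithm: the coefficients `sd_intercept` and `sd_slope`, the nearest-age matching rule and the synthetic portfolio are all assumptions made for the example.

```python
import math
import random
from collections import defaultdict

def lognormal_params(mean_age, sd_age):
    """Parameters (mu, sigma) of a log-normal with the given mean and standard deviation."""
    sigma2 = math.log(1.0 + (sd_age / mean_age) ** 2)
    return math.log(mean_age) - sigma2 / 2.0, math.sqrt(sigma2)

def smart_sample(portfolio, mean_age, n, rng, sd_intercept=2.0, sd_slope=0.2):
    """Pick n member ids (with replacement) whose ages follow the theoretical distribution.

    `portfolio` is a list of (member_id, age) tuples. The sd coefficients are
    placeholders, not fitted values.
    """
    # Hypothesis 2: the standard deviation is linear in the average age.
    sd_age = sd_intercept + sd_slope * mean_age
    mu, sigma = lognormal_params(mean_age, sd_age)

    members_by_age = defaultdict(list)
    for member_id, age in portfolio:
        members_by_age[age].append(member_id)
    available_ages = sorted(members_by_age)

    sample = []
    for _ in range(n):
        # Hypothesis 1: ages are log-normally distributed.
        target_age = rng.lognormvariate(mu, sigma)
        # Assumed matching rule: take a random member of the closest available age.
        closest_age = min(available_ages, key=lambda a: abs(a - target_age))
        sample.append(rng.choice(members_by_age[closest_age]))
    return sample

rng = random.Random(0)
portfolio = [(i, 18 + i % 50) for i in range(5000)]  # synthetic members aged 18 to 67
sample = smart_sample(portfolio, mean_age=35.0, n=2000, rng=rng)
```

To respect the gender percentage as well, the same function could be called once on the male members and once on the female members of the portfolio, with `n` set to each group's share of the target population.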

Fitting the pieces together

We put all the engine’s core pieces together:

1/ We collect basic demographic information from the prospect company, as well as the target guarantees.

2/ We use our smart-sampling algorithm to select existing member profiles from our portfolio that together match the demographic information, so we can identify the corresponding claim profiles.

3/ We run those claim profiles through our claim engine to compute how much we would reimburse under the target guarantees.

4/ We use simple math to translate this overall price for the company into a unique price structure for each employee and their dependents (for instance: 50€ per employee and partner + “family pack” at 30€ for any number of children).
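As an illustration of step 4/, here is one way such a translation could look. The split rule is an assumption (adult costs shared equally between adults, children's costs shared equally between families with children, prices expressed per month, no loading or margin), and the input figures are made up so that the result matches the example above.

```python
def price_structure(adult_cost, n_adults, children_cost, n_families_with_children):
    """Split a yearly expected cost into a monthly per-adult price and a flat family pack.

    Hypothetical rule: every adult (employee or partner) pays the same price,
    and every family with children pays the same pack whatever the number of children.
    """
    per_adult = adult_cost / n_adults / 12
    if n_families_with_children:
        family_pack = children_cost / n_families_with_children / 12
    else:
        family_pack = 0.0
    return per_adult, family_pack

# Made-up yearly costs, as would come out of step 3/.
per_adult, family_pack = price_structure(
    adult_cost=90_000.0, n_adults=150,
    children_cost=14_400.0, n_families_with_children=40,
)
```

With these inputs, `per_adult` is 50.0 and `family_pack` is 30.0.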

To make the price more robust, we repeat steps 2/ and 3/ on 50 distinct samples. The final price our sales team gives our prospects is then essentially the average of the 50 prices. Another benefit of computing prices on several samples is that we get a confidence interval on our price. Here we simply leverage the statistical bootstrap technique.
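A minimal sketch of the averaging and of the interval, assuming a percentile interval over the per-sample prices. The list of member costs is synthetic, and the plain resampling with replacement stands in for the sampling and claim-replay steps.

```python
import random
import statistics

def bootstrap_price(member_costs, sample_size, n_samples=50, level=0.95, seed=0):
    """Average price over n_samples resamples, with a percentile interval.

    `member_costs` holds one yearly reimbursement amount per portfolio member.
    Each resample stands in for one run of steps 2/ and 3/.
    """
    rng = random.Random(seed)
    prices = []
    for _ in range(n_samples):
        resample = rng.choices(member_costs, k=sample_size)  # draw with replacement
        prices.append(statistics.mean(resample))
    prices.sort()
    tail = (1.0 - level) / 2.0
    low = prices[round(tail * (n_samples - 1))]
    high = prices[round((1.0 - tail) * (n_samples - 1))]
    return statistics.mean(prices), (low, high)

# Synthetic yearly costs per member, in euros.
member_costs = [300.0, 450.0, 120.0, 800.0, 60.0, 520.0, 410.0, 275.0]
price, (low, high) = bootstrap_price(member_costs, sample_size=200)
```

A wide interval is a signal that the portfolio contains too few profiles resembling the prospect's population for the price to be trusted.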

What’s next?

This pricing approach is a perfect example of how we deal with new challenges at Alan. We do not jump into the classical method: we take a step back, we simplify and we leverage our strengths, making the best of a very strong team.

Nonetheless, this simplification comes at a cost. This method tends to overfit, so the quality of the price estimation is only as good as the underlying data of our current portfolio. Fortunately, our portfolio is growing 2–3x every year, and having more data will make our price computation more reliable at a very low engineering cost, with no model to re-run for parameter estimation. It allows us to offer the best health insurance at the fairest possible price from the first year.

If you are interested in this kind of challenge, drop us a line at jobs@alan.com. We’re hiring!
