Step Aside Credit Score: How traditional insurance pricing helps maintain inequality

Published in Root Enterprise · 13 min read · Nov 2, 2020

by Raja Chakravorti and Kyle Schmitt

The struggle for financial inclusion

The concept of financial inclusion has re-emerged recently, as racial disparities, and their impact on economic and social inequality, have come under renewed scrutiny. Access to affordable financial products is critical to building wealth, down to the individual level. But historically, many minority populations have had severely limited access to financial services, for unfair reasons.

Redlining, for example, is a discriminatory practice by which financial services were made unavailable to specific groups through either the outright denial of services or by selectively raising the prices for those services. This mechanism has amplified wealth inequality on a multi-generational level.

The practice of redlining has had a continued effect on segregating people by location. Though diversification has occurred, many neighborhoods continue to have socio-economic makeups that index towards homogeneity.

Redlining has also severely restricted access to capital: by creating a cyclical negative trend in home prices, savings, and public funding for local services like schools, by disincentivizing retail investment, and more.

And while redlining was made illegal through the 1968 Fair Housing Act and 1977 Community Reinvestment Act, credit score trends continue to reflect these generational disadvantages for communities of color in the United States. In fact, the legacy of redlining continues to amplify disparities, even as the country becomes more diverse.

Today, credit score is a fundamental driver for how consumers gain access to capital and financial services in the United States. Credit history functions as a gatekeeper to all forms of lending, and is a key mechanism for pricing insurance services.

As such, a person’s FICO® Score is deemed to be a measure of their financial health. Individuals with a high FICO Score are deemed to be a good — or safe — risk to repay loans.

These individuals tend to receive lending benefits in the form of low premiums or better rates for accessing financial services. Conversely, a low or nonexistent FICO Score can effectively deny someone access to these same services, because the premiums they are quoted skyrocket relative to the financial benefit of the loan.

Those with low or no FICO Scores often have lower overall incomes. So, although access to cash would be beneficial to these individuals, they are denied that critical opportunity.

With access to affordable credit lacking, these individuals are unable to build their credit histories, which further blocks access to financial services that could lift them up in the future. Because credit history is a critical component of a healthy credit score, populations such as new immigrants begin at the bottom of this cycle and may struggle to make their way out.

These historical biases against minorities have compounded over generations, unfairly preventing entire communities from acquiring critical financial services.

These effects are still being felt today, and their impact extends beyond credit access, home values, and educational opportunities. They have reached into insurance and other methods of protecting against financial loss, at times creating a multiplied negative effect.

Is credit score an accurate measure of risk?

To illustrate the problems associated with the traditional financial mechanisms described above — as they relate to analyzing and interpreting risk — we’ll focus on auto insurance pricing.

In our first paper of this series, Step Aside Credit Score: How Mobile Telematics Has Revolutionized Car Insurance, we explored how mobile telematics provides a better and more accurate predictor of risk in pricing auto insurance.

Traditional insurance rating factors, such as credit score, ZIP code, and income level, afford little opportunity for drivers to take control of their insurance premiums. Consumers are unable to alter their demographic information, and it takes a concerted effort over time to raise a low credit score, the origins of which are often tied to unfair factors in a consumer’s life.

The application of telematics to insurance turns the outdated model on its head. It does this by informing consumers of their primary risk factors and empowering them to make changes to their driving habits to reduce their risk — thus resulting in lower insurance premiums and greater savings.

In this second paper in our series, we attempt to understand the correlation between FICO Scores and actual driving behavior using anonymized data assembled by Root Insurance.

Is it possible there might be a better means of understanding credit risk? Could measured performance behaviors, such as driving scores, work to build a more accurate and fair profile of customers?

Driving score versus FICO Insurance Score

To illustrate the predictive power of telematics relative to credit information, we’ll utilize Root’s aggregated and anonymized data comprising more than 700,000 drivers who use the Root app.

Throughout this section, we compare Root’s driving score to the FICO Insurance Score. While the FICO Insurance Score is not the same as the FICO credit score that has been the standard for loan and creditworthiness since the 1990s, it relies on much of the same information from a consumer’s credit report. However, instead of predicting risk of default, the FICO Insurance Score predicts auto and homeowner insurance losses.

Correlations

Right away, the relatively weak correlation between the FICO Insurance Score and Root’s driving score should lead one to question the efficacy of the former in predicting auto losses. While driving score is made up of factors causally related to someone’s risk of incurring an auto claim, the credit-based score is purely correlative.

Root’s data suggest that a huge portion of the population with low and medium credit actually do not exhibit significant differences in driving behavior. There is, on the other hand, evidence that individuals with high credit-based scores trend towards being better drivers.

Similar patterns are observed within different subsets below, for instance with 25-year-olds (left image) and in the state of Georgia (right image).

The bottom line: credit is only weakly correlated with driving responsibility. Within the subpopulation of low-credit individuals, there is a significant portion of very good drivers.

Credit invisibility

What’s more, including so-called “no hit” (NH) or “no score” (NS) users, who lack sufficient credit data to be scored numerically, reveals that they have driving scores similar to half of the credit-visible population.

This is a powerful result. The credit-invisible population in the U.S. is at a significant disadvantage — as chronicled in the introduction of this white paper — but if driving score is accepted as a proxy for responsibility, this population has risk tendencies that mirror the median citizen.

Risk segmentation

The results above are certainly interesting from a consumer perspective, but what about the insurer?

  • Is telematics really better at predicting future loss?
  • What does an insurer stand to gain by adding a telematics program?
  • With a telematics program, what would an insurer stand to lose by dropping the FICO Insurance Score from their rating plan — as Root Insurance is committed to doing by 2025?

Before we examine the performance of driving score and FICO Insurance Score, here’s a brief refresher on risk segmentation.

Loss ratio is the ratio of the claims paid to premium earned. Loss ratio relativity (LRR) is the loss ratio of one entity or subpopulation relative to the full population.

Axiomatically, the LRR of the population as a whole is 1.
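As a minimal sketch of these two definitions, the Python snippet below computes loss ratio and LRR for two hypothetical segments. The `Segment` type, the segment names, and the dollar amounts are illustrative assumptions, not Root data.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    claims_paid: float     # total claims paid for the segment
    premium_earned: float  # total premium earned for the segment


def loss_ratio(claims_paid: float, premium_earned: float) -> float:
    """Loss ratio = claims paid / premium earned."""
    return claims_paid / premium_earned


def loss_ratio_relativities(segments: list[Segment]) -> dict[str, float]:
    """LRR of each segment = segment loss ratio / full-population loss ratio."""
    total_claims = sum(s.claims_paid for s in segments)
    total_premium = sum(s.premium_earned for s in segments)
    overall = loss_ratio(total_claims, total_premium)
    return {
        s.name: loss_ratio(s.claims_paid, s.premium_earned) / overall
        for s in segments
    }


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
segments = [
    Segment("A", claims_paid=60.0, premium_earned=100.0),
    Segment("B", claims_paid=90.0, premium_earned=100.0),
]
lrr = loss_ratio_relativities(segments)  # A is about 0.8, B is about 1.2

# The premium-weighted average of the segment LRRs recovers the population LRR of 1.
population_lrr = sum(lrr[s.name] * s.premium_earned for s in segments) / sum(
    s.premium_earned for s in segments
)
```

Here the full population has a loss ratio of 0.75, so segment A (loss ratio 0.60) is overpriced relative to the whole and segment B (loss ratio 0.90) is underpriced.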

The fundamental purpose of risk segmentation is to identify new features that can separate the population into different risk groups; specifically, those that the current rating model (the “on-level model”) is underpricing or overpricing.

Unpredictive or random features are those that fail to generate different LRRs in different segments (such as shown here).

Predictive features result in segments with higher or lower LRR. Such features are said to generate lift. A plot demonstrating LRR as a function of different segments is called a single lift chart.

Lift charts are often shown in five or ten segments. For simplicity, we illustrate just three segments.

Analysts and actuaries typically evaluate a feature based on net lift (the span between the lowest and highest risk segments). They also look at how risk tends to trend up or down consistently across the feature.
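The snippet below is a rough sketch of how the numbers behind a single lift chart might be computed: sort policies by the candidate feature, cut them into equal-count segments, and take each segment's LRR. The function names, the tuple layout, and the six example policies are assumptions made for illustration, and net lift is taken here as the highest segment LRR minus the lowest.

```python
def single_lift(policies, n_segments=3):
    """Return the LRR of each equal-count segment, ordered by feature value.

    policies: list of (feature_value, claims_paid, premium_earned) tuples.
    """
    ordered = sorted(policies, key=lambda p: p[0])
    overall_lr = sum(p[1] for p in ordered) / sum(p[2] for p in ordered)
    n = len(ordered)
    lrrs = []
    for k in range(n_segments):
        chunk = ordered[k * n // n_segments:(k + 1) * n // n_segments]
        claims = sum(p[1] for p in chunk)
        premium = sum(p[2] for p in chunk)
        lrrs.append((claims / premium) / overall_lr)
    return lrrs


def net_lift(lrrs):
    """Span between the highest and lowest segment LRR."""
    return max(lrrs) - min(lrrs)


def is_monotonic(lrrs):
    """True if risk trends consistently up or consistently down across segments."""
    pairs = list(zip(lrrs, lrrs[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


# Hypothetical policies: (feature value, claims paid, premium earned).
policies = [
    (1, 40.0, 100.0), (2, 50.0, 100.0),
    (3, 70.0, 100.0), (4, 80.0, 100.0),
    (5, 100.0, 100.0), (6, 110.0, 100.0),
]
lrrs = single_lift(policies, n_segments=3)  # about [0.6, 1.0, 1.4]
lift = net_lift(lrrs)                       # about 0.8
```

A random feature would produce segment LRRs that all sit near 1, giving a net lift near zero.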

As it turns out, the comparison is not particularly competitive — Root’s driving score factor segments risk far more effectively than the FICO Insurance Score.

The single lift charts below demonstrate segmentation of loss ratio (before inclusion of scores) for FICO and driving scores separately. While both curves are roughly monotonic, the lift afforded with telematics is three times steeper.

Insurance analysts often assess the head-to-head performance of two models using a double lift chart. Here, we sort all policyholders by the ratio of one model to another — in this case, the driving score factor divided by the FICO factor.

On the right-hand side, we find policyholders whom the telematics model identifies as far riskier than FICO shows. On the left-hand side, we find policyholders whom the telematics model identifies as far less risky than FICO. This plot shows that FICO alone would aggressively overprice approximately 40% of policyholders (left side) and aggressively underprice about 20% (right side).
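A double lift chart can be sketched in a few lines, under the assumption that each policy carries a factor from each model plus its observed claims and premium. The field names (`factor_a`, `factor_b`), the bucket count, and the four example policies below are hypothetical; this is one plausible way to compute the chart, not a description of Root's tooling.

```python
def double_lift(policies, n_buckets=2):
    """Bucket policies by the ratio of model A's factor to model B's factor.

    policies: list of dicts with keys factor_a, factor_b, claims, premium.
    Returns one summary dict per bucket, ordered from lowest to highest ratio.
    """
    ordered = sorted(policies, key=lambda p: p["factor_a"] / p["factor_b"])
    overall_lr = sum(p["claims"] for p in ordered) / sum(p["premium"] for p in ordered)
    n = len(ordered)
    buckets = []
    for k in range(n_buckets):
        chunk = ordered[k * n // n_buckets:(k + 1) * n // n_buckets]
        premium = sum(p["premium"] for p in chunk)
        buckets.append({
            # Premium-weighted average factor from each model.
            "avg_factor_a": sum(p["factor_a"] * p["premium"] for p in chunk) / premium,
            "avg_factor_b": sum(p["factor_b"] * p["premium"] for p in chunk) / premium,
            # Observed loss ratio relativity in the bucket.
            "actual_lrr": (sum(p["claims"] for p in chunk) / premium) / overall_lr,
        })
    return buckets


# Hypothetical policies: model A plays the driving score factor, model B the FICO factor.
policies = [
    {"factor_a": 0.8, "factor_b": 1.2, "claims": 30.0, "premium": 100.0},
    {"factor_a": 0.9, "factor_b": 1.0, "claims": 45.0, "premium": 100.0},
    {"factor_a": 1.1, "factor_b": 1.0, "claims": 75.0, "premium": 100.0},
    {"factor_a": 1.3, "factor_b": 0.9, "claims": 90.0, "premium": 100.0},
]
buckets = double_lift(policies, n_buckets=2)
```

In this toy data the observed LRR tracks model A's average factor (low in the first bucket, high in the second) while model B's points the other way, which is the pattern a double lift chart is designed to expose.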

In total, a strong telematics rating factor, like Root’s industry-leading offering, is 10–20 times more correlated with future loss per earned car year (ECY) than FICO.

The addition of a telematics factor to an on-level rating plan can be expected to increase lift by upwards of 30% (likewise for Gini index, another common risk segmentation performance metric). The FICO Insurance Score might add a few percent. This result is repeatable across different coverages, customer segments, and product lines.

The data confirm that telematics is a better tool for segmentation.

Fairness index

Too often, however, insurance companies anchor purely on segmentation performance and fail to consider what implicit biases might be introduced. Does telematics stand the test of fairness?

Below, we propose a fairness index to enforce the standard that a new insurance feature creates opportunity for safe drivers, irrespective of race, age, or financial means. Then, we hold telematics to that standard.

In the previous diagram, we looked at the whole population. In the whole population, each risk group has a uniform number of policies by definition.

Isolating to a specific subpopulation, we might find uneven exposures in the risk group, as illustrated to the left by different numbers of policies in each risk group.

When this occurs, the new factor would result in an increase (or decrease) in the average premium for that subpopulation. For example, in the illustration to the left, the subpopulation shown has an effective loss ratio relativity above 1, or a net surcharge.
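To make the effect of uneven exposure concrete, the sketch below takes the segment LRRs from a full-population lift chart and weights them by a subpopulation's policy counts. It assumes, for simplicity, that every policy carries equal premium; the function name and all numbers are illustrative.

```python
def effective_relativity(segment_lrrs, subpop_policy_counts):
    """Exposure-weighted average of segment LRRs for one subpopulation.

    Assumes equal premium per policy, so policy counts can serve as weights.
    A result above 1 is a net surcharge; below 1 is a net discount.
    """
    total = sum(subpop_policy_counts)
    weighted = sum(l * c for l, c in zip(segment_lrrs, subpop_policy_counts))
    return weighted / total


segment_lrrs = [0.6, 1.0, 1.4]  # hypothetical low-, medium-, high-risk segments

# Whole population: uniform policy counts by construction, so the result is 1.
whole = effective_relativity(segment_lrrs, [100, 100, 100])

# A subpopulation skewed toward the high-risk segment lands above 1.
skewed = effective_relativity(segment_lrrs, [50, 100, 150])  # about 1.13
```

With these made-up counts the skewed subpopulation would see roughly a 13% average surcharge from the new factor.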

We propose a fairness index that is equal to the average discount afforded to the safest half of a given subpopulation, when introducing a new rating factor to an on-level rating plan and pricing to its indication on the full population.

In other words, this is the discount the safest half of the subpopulation could earn if they weren’t implicitly subsidizing the riskier half of their subpopulation (owing to the missing factor).
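One way to read this definition as code is sketched below. It assumes each policy in the subpopulation already carries the relativity indicated by the new factor on the full population, plus a risk ordering value from that same factor (lower means safer). The function name, input layout, and example relativities are assumptions for illustration rather than the authors' implementation.

```python
def fairness_index(subpop):
    """Average discount for the safest half of a subpopulation.

    subpop: list of (risk_value, indicated_relativity) tuples, where a lower
    risk_value means a safer policyholder under the new rating factor and
    indicated_relativity is the factor priced on the full population.
    """
    if len(subpop) < 2:
        raise ValueError("need at least two policies to take the safest half")
    ordered = sorted(subpop, key=lambda p: p[0])
    safest_half = ordered[:len(ordered) // 2]
    avg_relativity = sum(rel for _, rel in safest_half) / len(safest_half)
    return 1.0 - avg_relativity  # discount = 1 - average relativity


# Hypothetical subpopulation skewed toward safe behavior.
safe_skew = [(1, 0.6), (2, 0.6), (3, 0.6), (4, 0.6), (5, 1.0), (6, 1.4)]
# Hypothetical subpopulation skewed toward risky behavior.
risky_skew = [(1, 0.6), (2, 1.0), (3, 1.4), (4, 1.4), (5, 1.4), (6, 1.4)]

index_safe = fairness_index(safe_skew)    # about 0.40, a 40% average discount
index_risky = fairness_index(risky_skew)  # about 0.00, no average discount
```

The two toy subpopulations show why the index depends both on how strongly the factor segments risk and on where the subpopulation's exposure sits along it.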

In the first example below, we look at the addition of driving score to a rating plan for a subpopulation of policyholders who are at least 60 years old.

Each point on the red line represents the on-level loss ratio relativity, or indicated discount, for five risk segments ordered by driving score. Each bar represents the exposure of the subpopulation in each risk segment.

Not only is driving score a powerful segmenter of risk, but this subpopulation also skews toward better driving. This factor would lead to a 35% average discount for the safer half of the subpopulation (as segmented by their driving behavior).

In the second example below, we look at the addition of a mileage-only-based factor to a rating plan for a subpopulation of users from ZIP codes with fewer than 50 persons per square mile.

Mileage not only affords weaker segmentation, but the rural population also tends to skew toward more mileage, which, holding all else constant, is empirically related to more risk. This factor would only lead to a 6% average discount for the safer half of the rural subpopulation (as segmented by their mileage).

By aggregating this concept over several commonly accepted car insurance pricing factors and across several traditionally disadvantaged subpopulations, we arrive at the results below (based on property damage coverage loss and premium for Root’s historical policyholders, excluding exposures impacted by COVID-19). For each subpopulation, telematics earns the best fairness index.

On balance, driving score creates far and away the most opportunity for the least risky drivers within different subpopulations.

A fairness index alone isn’t enough

While the fairness index concept is useful in illustrating the power of telematics for both the consumer and the insurer, this alone can’t ensure fairness. Additional considerations must include:

  • Causation versus correlation: As an ideal, insurers should strive for causal relationships between risk propensity and the factors used in pricing. Driving score is emblematic of this principle. It includes behaviors like hard braking and speeding that can be causally related to accidents.
  • Transparency and modifiability: Insurers should also strive for factors that the consumer understands and has the ability to control. A well-run telematics program keeps the customer informed of measurable risk factors — like distracted driving and driving during dangerous times — and coaches them to improve.
  • Privacy: One of the clearest challenges for telematics is its privacy implications. One step insurers can take to mitigate these concerns is to develop a telematics score that does not require access to location (GPS) data. Furthermore, insurers should be upfront about what data they are collecting, why they are collecting it, and how that data will be stored and protected.
  • Access: Insurers should strive for factors that can be collected reliably across a broad swath of the population, especially when a factor presents an opportunity for a discount. Credit availability, for instance, poses issues for using FICO in insurance. As of 2020, in the U.S., more than 80% of the driving population has a smartphone and ~20% of registered vehicles are sufficiently connected to power a modern telematics program. Still, insurers should continue to supplement with OBD-II-based programs to cover the gap that remains.

To learn more about the additional considerations highlighted above, please read our first paper in this series, Step Aside Credit Score: How Mobile Telematics Has Revolutionized Auto Insurance.

A fairer, more accurate measure of creditworthiness

Ultimately, financial inclusion is predicated on a number of factors that affect individuals.

In this paper, we’ve shown that the fairness index concept is a powerful and more equitable way to refine how companies provide auto insurance to consumers, when looked at in tandem with existing factors. Similarly, it’s possible that this concept could apply broadly to other areas of financial well-being.

For example, new immigrant populations tend to have less access to affordable credit cards because their FICO Scores may preclude them. But many neo-lenders have developed credit products in recent years aimed at leveling the playing field for unbanked or underbanked populations.

The concept of a fairness index based on actual behavioral risk factors might be beneficial for understanding risk segmentation more granularly in these populations, thus creating greater financial wellness for broader populations — and making financial inclusion one step closer to being a reality for everyone.

Learn more about Root Enterprise here.

Technical considerations

The results of this report are built on a temporally windowed subset of data across hundreds of thousands of Root Test Drives and Root policyholders for property damage (PD) and collision (COLL) coverages and all product lines.

Policyholders with accounts created prior to January 1, 2019 are excluded (although results were confirmed to hold for early policyholders). Loss exposures after March 1, 2020 were excluded. Loss exposures are 5-month developments. Only single-driver, single-vehicle households were included (although results were confirmed to hold for multi-policies). Results involving loss ratio use Root’s latest on-level rating plan.

Factors for FICO Insurance Scores, violation points, and vehicle year, make, and model (YMM) were based on Root’s latest on-level rating plan. The mileage factor was built based on annualized mileage prediction from Root’s test drive (so it is likely weaker than one afforded through a continuous connection).

The subpopulations studied in the fairness index section include:

  • Black-majority ZIP codes: having more than 70% Black populations according to U.S. Census data
  • Hispanic-majority ZIP codes: having more than 60% Hispanic populations according to U.S. Census data
  • Low-income ZIP codes: having a median household income below $20,000 according to U.S. Census data

These thresholds were selected to give roughly 10,000–15,000 users per subpopulation, while also remaining representative of a protected or sensitive attribute. The results of this white paper do not change under modifications to these thresholds.

Root does not collect information about race or household income from its quoted population. As such, this analysis relies on ZIP code as a proxy for these factors. Although this is imperfect, the authors of this report believe all key conclusions would hold up under refinement.

Root’s telematics score is believed to be best in class. Other insurers may not be able to reproduce the degree of these results, if their driving score is less powerful. Still, the authors firmly believe that credible telematics programs will achieve a driving score factor with a superior fairness index.

Results are based on analysis of policyholders, including biases incurred by their bind tendencies. To obtain answers to the counterfactual questions posed in this white paper, results would ideally be produced on policyholders whose quotes did not use the explored factors. Unfortunately, a credibly sized data set of policyholders priced without using driving score, territory, mileage, FICO, YMM, and violation points was not available.
