An AI that understands insurance claims

Oct 23, 2017

There is something really cool about detecting fraud for insurance companies: each alert we send helps claim handlers monitor suspicious behavior, fight fraud and reduce claims losses. As an AI company specialized in insurance claims processing, we often surprise people in the tech world with such a focused positioning. Indeed, we don’t help millions of people sort their pictures or find new restaurants in their area. We help thousands of claim handlers do their job better, and we are proud of it.

Today, we are thrilled to announce our first leap beyond fraud detection. As head of our research team, I am particularly excited to describe our first prototype for automated claims processing based on scanned documents.

In many lines of business, claim handlers must read, analyse and sometimes manually enter data from scanned documents to process their claims: cost estimates in home insurance, flight tickets in travel, invoices in health, medical certificates in workers’ compensation, payment statements in incapacity, or death records in life insurance. These tedious tasks represent a significant workload, are error-prone and add little value. The ability to automate them would thus be a game changer for insurers.

We made the strategic decision to tackle this challenge at the beginning of the summer. We knew it would not be easy. Even with all the recent advances in AI, it could not be; otherwise at least one AI company would already have done it.

After a few weeks of testing state-of-the-art approaches, we were able to grasp the size of the challenge. Processing scanned documents to assess insurance coverage requires much more than a simple OCR (Optical Character Recognition) tool. OCR, when accurate enough, gives the ability to read such documents the way a child would. A child reading an invoice or a payment statement, however, would not be able to deduce whether a claim is covered and how much should be reimbursed. For that, one needs to understand the document, that is to say, grasp the meaning of each date, amount, name, etc. This already significant challenge is amplified when the scanned documents are noisy, and even more so when they don’t follow a predefined template. This is what real-world data looks like, and any solution must overcome all these challenges simultaneously; otherwise it is useless in practice.

After three months of iteration between practical optimizations and theoretical developments, we have managed to design a generic algorithm fit for this task. It relies on an extension of deep learning backpropagation to optimize the problem of classification with a reject option. This means that the model is allowed to say “I don’t know” when it is not confident enough in its prediction, which lets it learn much more efficiently on the cases where it does predict. As a result, after only a few days of training, the AI was able to understand the documents.
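To make the “I don’t know” behavior concrete, here is a minimal sketch of a reject option applied at prediction time. It is not the training algorithm described above: it only shows the simplest form of abstention, a confidence threshold on class probabilities. The function name `predict_with_reject`, the threshold value and the example probabilities are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def predict_with_reject(probs: Sequence[float],
                        threshold: float = 0.9) -> Optional[int]:
    """Return the index of the most likely class, or None ("I don't know")
    when the model's top probability is below `threshold`.

    `probs` is assumed to be a probability distribution over classes,
    e.g. the softmax output of a classifier. The 0.9 default is arbitrary.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None


# Hypothetical outputs for three fields read from a scanned document.
confident = predict_with_reject([0.97, 0.02, 0.01])   # class 0
uncertain = predict_with_reject([0.40, 0.35, 0.25])   # None: abstains
borderline = predict_with_reject([0.05, 0.93, 0.02])  # class 1
```

Raising the threshold makes the model abstain more often but err less on the cases it does answer; lowering it has the opposite effect.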

This model is also well suited to operational use: when it predicts, it can be trusted and the claim is processed automatically; otherwise the claim is rerouted to a claim handler. Its performance is therefore naturally measured through two KPIs: its accuracy when it predicts, and its prediction rate. A model that predicts for all claims but makes too many mistakes cannot be trusted; conversely, a model that is always correct but predicts for only a small fraction of the claims is not useful.
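The two KPIs can be computed in a few lines. The sketch below assumes that an abstention is represented by `None`; the function name `reject_option_kpis` and the toy labels are illustrative, not taken from our system.

```python
from typing import Hashable, Optional, Sequence, Tuple


def reject_option_kpis(predictions: Sequence[Optional[Hashable]],
                       labels: Sequence[Hashable]) -> Tuple[float, float]:
    """Return (prediction_rate, accuracy_when_predicting).

    prediction_rate: share of cases where the model gave an answer.
    accuracy: share of correct answers among the answered cases only.
    A prediction of None means the model abstained.
    """
    answered = [(p, y) for p, y in zip(predictions, labels) if p is not None]
    prediction_rate = len(answered) / len(predictions) if predictions else 0.0
    correct = sum(1 for p, y in answered if p == y)
    accuracy = correct / len(answered) if answered else 0.0
    return prediction_rate, accuracy


# Toy example: 5 cases, the model abstains on 2 and gets 2 of the other 3 right.
predictions = ["amount", None, "date", "name", None]
labels = ["amount", "date", "date", "amount", "name"]
rate, acc = reject_option_kpis(predictions, labels)
print(f"prediction rate = {rate:.0%}, accuracy when predicting = {acc:.0%}")
```

Neither number is meaningful alone: a model can reach perfect accuracy by answering almost nothing, or a 100% prediction rate by answering carelessly.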

Once the concepts are extracted, the module automatically decides whether to pay and how much, based on the insurer’s parametrized covers and internal handling rules. Such automation, however, amplifies the risk of fraud. Fortunately, the module uses our Force™ solution for fraud detection, with specific extensions tailored to fraudulent behaviors that target claim automation.
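As a rough illustration of how these pieces fit together, the sketch below routes a claim either to automatic payment or to a claim handler. Everything in it is hypothetical: the function name `route_claim`, the single `cover_limit` parameter and the boolean `fraud_alert` stand in for an insurer’s real covers, handling rules and fraud signals, which are far richer.

```python
from typing import Optional, Tuple


def route_claim(extracted_amount: Optional[float],
                cover_limit: float,
                fraud_alert: bool) -> Tuple[str, float]:
    """Decide who handles a claim and how much is paid automatically.

    extracted_amount: amount read from the document, or None if the
        model abstained ("I don't know").
    cover_limit: hypothetical maximum payable under the cover.
    fraud_alert: hypothetical flag raised by a fraud detection step.
    """
    if extracted_amount is None:
        # The model was not confident enough: a human reads the document.
        return ("claim_handler", 0.0)
    if fraud_alert:
        # Suspicious claims are never paid automatically.
        return ("claim_handler", 0.0)
    # Simplest possible handling rule: pay the amount, capped by the cover.
    return ("auto_pay", min(extracted_amount, cover_limit))


print(route_claim(420.0, 500.0, fraud_alert=False))
print(route_claim(None, 500.0, fraud_alert=False))
print(route_claim(420.0, 500.0, fraud_alert=True))
```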

We first tested our AI module on payment protection insurance. This line of business covers insurance products that guarantee consumers’ credit repayments in case of loss of income (due to illness, accident, unemployment or death). In France, the processing of these claims is based on payment statements from the national welfare system. Every two weeks during a claim, which usually lasts from several months to several years, the insured must send a statement to justify their inability to work. Claim handlers then enter the data from the statements into their system, which computes how much the insured has already been reimbursed and triggers the associated payment.

On a test set of more than 1M documents, our AI module had a prediction rate of 81% with an accuracy of 97%, even higher than that of human claim handlers. For a volume of 1M claims per year, this saves a workload worth $50M and reduces leakage by more than $20M per year.
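To see what these two rates mean in volume, here is the arithmetic for a round figure of exactly 1,000,000 documents. The round figure is an assumption for illustration (the test set contained more than 1M documents); the dollar savings depend on cost figures not given here and are not derived.

```python
total_documents = 1_000_000   # assumed round volume
prediction_rate_pct = 81      # share of documents the model answers
accuracy_pct = 97             # share of correct answers among those

# Integer arithmetic keeps the counts exact.
automated = total_documents * prediction_rate_pct // 100
rerouted = total_documents - automated
correct = automated * accuracy_pct // 100
errors = automated - correct

print(f"automated: {automated:,}")            # automated: 810,000
print(f"rerouted to handlers: {rerouted:,}")  # rerouted to handlers: 190,000
print(f"correct: {correct:,}")                # correct: 785,700
print(f"errors: {errors:,}")                  # errors: 24,300
```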

In addition to these savings, our AI module opens a new opportunity: direct claim settlement for the insured. By simply uploading their payment statements, they can get real-time confirmation that each statement was accepted and that the payment will arrive. And the beauty of it is that our AI module does not require a full overhaul of the insurer’s IT system: it can simply be plugged on top of it, thanks to the optimized API implemented by our development team.

Based on this first success, we have decided to invest heavily in this direction. We have started applying our algorithm to other business lines, and the next steps will be to extend it to other types of data, including text claim statements and pictures of material damage.


Written by Éric Sibony