Arimo’s Narrative App with Deep Learning Extension Shows Path to Automate $200B in Medicare Claims Adjustment
A Deep Learning model of Medicare claims data predicts with 85.6% accuracy the total overcharge on 85M claims and visually displays which claims are most likely fraudulent.
When a healthcare provider submits a claim to Medicare, Medicare seldom pays the full amount. In fact, when the Centers for Medicare & Medicaid Services (CMS) published its aggregated dataset containing 18M rows of Medicare claims data in 2013, the data showed that on average Medicare pays only about 37% of the amount billed, a little more than a third. Yet while 37% may be the average, it is not universal. The case for better Medicare claims adjustment is well known, just as high-profile cases of Medicare fraud are well publicized. According to research by The Economist, Medicare and Medicaid pay nearly $1T per year, of which as much as $100B is fraudulent. And three of the top 20 Medicare payment recipients revealed in the CMS data (Drs. Salomon E. Melgen, Asad U. Qamar, and Farid Fata, who together received more than $75M) face federal Medicare fraud charges.
Dividing the claims into two groups — fraudulent versus legitimate — raises the question of whether data in one group has more in common with its own members than with members of the other group. If the answer is “yes,” then it also might be possible by way of a neural network to pick out the likely fraudulent claims on a cluster map. Furthermore, if a claim’s appropriate adjustment amount is unknown, it might also be possible to estimate it with high accuracy by looking at its nearest neighbors on the map — regardless of whether the claim is legitimate or not. That is the premise we tested using Arimo’s Predictive Engine and Narratives software to build a Deep Learning (DL) model of the CMS dataset. In other words, could all the effort that goes into adjusting claims be done much faster by machine with much higher accuracy?
The Variable to Model — Overcharge Ratio
Spanning two years, 2012–2013, the CMS dataset contains identifying information about healthcare providers and aggregated statistics concerning Medicare claims, including:
- The amount providers billed Medicare
- The amount Medicare paid
- Which treatment the claim is for
- and much more.
If fraudulent claims differ consistently from legitimate claims according to certain rules (even unknown ones), then there might be patterns in the data that suggest such rules exist. One would expect, for example, that the adjustment amount (i.e., the amount billed minus the amount Medicare pays) should be higher for fraudulent claims than for legitimate claims. As noted, on average Medicare pays 0.37 of a claim, so the average downward adjustment as a fraction of the amount billed, called the overcharge ratio, is 0.63, or:
Overcharge ratio = 1 - (total paid / total submitted) = 1 - 0.37 = 0.63.
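As a minimal sketch of the formula above, the ratio can be computed per claim row as follows. The function name `overcharge_ratio` and the example dollar amounts are illustrative assumptions, not part of the CMS dataset or of Arimo's software.

```python
def overcharge_ratio(total_paid: float, total_submitted: float) -> float:
    """Fraction of the billed amount that Medicare did not pay."""
    if total_submitted <= 0:
        raise ValueError("total_submitted must be positive")
    return 1.0 - total_paid / total_submitted


# Hypothetical claim: $1,000 billed and $370 paid, matching the 37% average.
example_ratio = overcharge_ratio(total_paid=370.0, total_submitted=1000.0)
print(round(example_ratio, 2))
```

A ratio near 0 means Medicare paid almost everything that was billed; a ratio near 1 means almost nothing was paid.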
One rule to look for, then, is whether the overcharge ratio varies with the amount billed. Here we compare the distribution of overcharge ratios by number of claims (CMS rows) with the distribution by claim amounts:
The two graphs (left) look very different. Higher claim amounts have higher overcharge ratios, which may point to a diminishing-returns rule for bills “inflated” above a certain point. Also, both graphs become discontinuous at about the 0.2 mark, suggesting that another rule of some sort kicks in there. Although these rules aren’t known, there is evidence that they exist and can therefore potentially be modeled to help distinguish fraudulent from legitimate claims. Further analysis reveals other evidence of hidden rules.
Note, for example, how much more two specific types of healthcare providers were paid in cases where Medicare made little or no adjustment to claims (overcharge ratios <0.2). Of the 20 types ranked, internal medicine and family practice overwhelmingly were paid the most — almost $750M versus less than $300M for the remaining 18 combined.
Another rule seems to distinguish individual providers by amount paid . . .
And even more so if, again, only low overcharge ratios are considered — 0.2 or less — as in this graph:
At $22M, Dr. Cockerill is the fourth-largest recipient of Medicare payments, and yet $10M of his claims are barely adjusted at all by Medicare. All of the doctors on this list were paid large and largely uncontested sums by Medicare. Collectively, claims with an overcharge ratio below 0.2 were paid a total of $3B.
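The low-adjustment rankings described in this section amount to a filter followed by a grouped sum. The sketch below illustrates that idea on a few made-up rows; the tuple layout, the function name `paid_by_type_low_adjustment`, and the sample values are assumptions for illustration and do not reflect the actual CMS column names or figures.

```python
from collections import defaultdict

# Hypothetical rows: (provider_type, total_submitted, total_paid).
rows = [
    ("Internal Medicine", 1000.0, 900.0),  # overcharge ratio about 0.10
    ("Family Practice", 500.0, 450.0),     # overcharge ratio about 0.10
    ("Cardiology", 2000.0, 600.0),         # overcharge ratio 0.70
    ("Internal Medicine", 800.0, 200.0),   # overcharge ratio 0.75
]


def paid_by_type_low_adjustment(rows, threshold=0.2):
    """Sum payments per provider type for rows with ratio below threshold."""
    totals = defaultdict(float)
    for provider_type, submitted, paid in rows:
        if submitted <= 0:
            continue  # skip rows where the ratio is undefined
        ratio = 1.0 - paid / submitted
        if ratio < threshold:
            totals[provider_type] += paid
    # Largest recipients first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = paid_by_type_low_adjustment(rows)
print(ranking)
```

Grouping by an individual provider identifier instead of provider type would give the per-doctor ranking in the same way.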
What the Deep Learning Model Revealed
To exploit these types of underlying rules, we created a Deep Learning model to predict the overcharge ratio from claims data, excluding how much Medicare paid. We trained and tested a neural network on the CMS dataset with the following configuration:
- Training Size: 10M
- Testing Size: 2.5M
- Input Neurons: 106
- Output Neurons: 1
- Hidden Layers: 3
- Hidden Units: 1024 × 1024 × 1024
- Run Time: ~ 10 hours
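To give a sense of the scale of this configuration, the sketch below counts the trainable parameters of a network with these layer sizes. It assumes plain fully connected layers with one bias per unit; the article does not state the layer types, so the resulting count is illustrative rather than a description of the actual model.

```python
def dense_parameter_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total


# 106 inputs, three hidden layers of 1024 units, 1 output (from the list above).
layer_sizes = [106, 1024, 1024, 1024, 1]
n_params = dense_parameter_count(layer_sizes)
print(n_params)  # 2209793 under the fully connected assumption
```

Under that assumption the model has roughly 2.2 million parameters, most of them in the two 1024-to-1024 connections between hidden layers.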
The model achieved an accuracy of 85.6%, indicating that a machine could in fact perform Medicare claims adjustment at this level of accuracy. That represents $200B in reduced payments, with significantly lower personnel costs.
Displaying a random sample of claims data also shows that high-adjustment claims tend to cluster. That means that, given a claim whose appropriate adjustment amount is unknown, we can estimate it with high accuracy by looking at its nearest neighbors:
This kind of analysis can be used to distinguish regular claims from fraudulent ones and can be applied over a specific region, as in Florida, here:
Again, we see that the high-adjustment claims tend to cluster together. Given a new, unprocessed claim, we could place it in this visualization to see what company it keeps.
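The nearest-neighbor idea can be sketched as a simple k-nearest-neighbors average over map coordinates. This is an illustration under stated assumptions: the 2D coordinates, the choice of k = 3, the Euclidean distance, and the function name `knn_estimate` are all hypothetical, and the article does not describe how the cluster map or its neighbor lookup was actually built.

```python
import math


def knn_estimate(query, points, k=3):
    """Average the overcharge ratios of the k points closest to query."""
    if k <= 0 or not points:
        raise ValueError("need k > 0 and at least one reference point")
    nearest = sorted(points, key=lambda p: math.dist(query, p[0]))[:k]
    return sum(ratio for _, ratio in nearest) / len(nearest)


# Hypothetical map: ((x, y), known overcharge ratio) for already-adjusted claims.
points = [
    ((0.0, 0.0), 0.60),
    ((0.1, 0.0), 0.70),
    ((0.0, 0.1), 0.65),
    ((5.0, 5.0), 0.10),
    ((5.1, 5.0), 0.15),
]

# A new claim that lands near the first cluster inherits that cluster's ratios.
estimate = knn_estimate((0.05, 0.05), points, k=3)
print(round(estimate, 2))
```

Sorting every point is fine for a small sample; a dataset with millions of rows would call for a spatial index or an approximate nearest-neighbor search instead.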
It’s Not Just About Medicare Fraud
But the model did more than just confirm that machine learning could adjust Medicare claims far faster and more accurately than current methods. It also confirmed that hidden rules often exist for distinguishing members of large datasets and that these rules can be modeled and applied even if not specifically known. Furthermore, it shows that there can be significant financial return in doing so — in business cases that range from fraud detection, to assigning credit scores, to market segmentation, and more.
The key is Deep Learning — or, more specifically, an ecosystem where the techniques of Deep Learning can themselves be robustly and economically applied in a practical way. Finding $200B only took 10 hours. And there’s much more where that came from.