Difficulty Of Identifying Diabetic Patients As Diabetic

Yubin Park
Published in accordionhealth
Jul 21, 2016

Risk adjustment has arguably become one of the most important yet complex operational programs for modern payers. Every year, payers invest millions of dollars in risk adjustment programs, and some still lose millions of dollars on them [1,2]. Because of this unpredictability, risk adjustment has sometimes been referred to as “legalized gambling” [3].

So what is risk adjustment? Risk adjustment is a statistical process that assesses enrollees’ current or future medical needs based on their current conditions and adjusts payments accordingly. The premise is that sicker enrollees need more medical services and cost more than healthier enrollees. Based on the assessed risk scores, the government adjusts the overall payments to payers: the higher the risk score, the larger the payment. Hence the process is called “risk adjustment”.
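As a rough illustration of "the higher the risk score, the larger the payment", the sketch below scales a base payment by each enrollee's risk score. The linear formula, the base rate, and the scores are all simplifying assumptions made for illustration; real HHS and CMS payment calculations involve many more factors.

```python
# Simplified sketch: payment scales linearly with the risk score.
# BASE_RATE and the risk scores below are hypothetical values.

def adjusted_payment(base_rate: float, risk_score: float) -> float:
    """Return the risk-adjusted payment for one enrollee."""
    return base_rate * risk_score

BASE_RATE = 800.0  # hypothetical monthly base payment per enrollee

# A score of 1.0 stands for an enrollee of average expected cost.
risk_scores = {"healthier": 0.7, "average": 1.0, "sicker": 1.8}

payments = {
    name: adjusted_payment(BASE_RATE, score)
    for name, score in risk_scores.items()
}

for name, amount in payments.items():
    print(f"{name}: {amount:.2f}")
```

Under this assumption, a condition that goes unrecorded lowers the risk score and therefore the payment, which is why the assessment step matters so much.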

The first (and most critical) step of risk adjustment is the assessment step. At this point, some people may ask: “How difficult can it be to assess the risk of your population? Don’t payers know about their enrollees’ current conditions?” The short answer is: “It is very difficult”. To see why, one needs to understand the multi-faceted ecosystem of healthcare payers and providers, and the even more complex life cycles of healthcare data. Covering all of these issues in one blog post is almost impossible, so we would like to give you a teaser of some of them using diabetes as an example.

Identifying Diabetic Patients Using Available Data

If you are a data wrangler in a payer organization, the first source of data you will touch is claims data. Claims data, submitted by providers for reimbursement purposes, contain diagnosis and procedure codes and other billing-related variables. From claims data, one can easily find diabetes-related diagnosis codes (e.g. 250.00 in ICD-9) and may conclude that enrollees with such codes are diabetic. This easy and intuitive approach has several flaws, though. The procedure and prescription codes in claims data are fairly accurate and are useful for assessing quality of care [4]. The diagnosis codes, however, do not perfectly reflect the diagnoses in medical records, as described by E. Fisher et al. in their seminal paper [5].
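The naive claims-only approach can be sketched in a few lines. The claim rows and member IDs below are made up for illustration, and the rule simply checks for the ICD-9 250.xx family mentioned above; it is not any payer's actual logic.

```python
# Hypothetical claims rows: (member_id, ICD-9 diagnosis code).
claims = [
    ("M001", "250.00"),  # diabetes code on a claim
    ("M001", "401.9"),
    ("M002", "401.9"),   # no diabetes code on any claim
    ("M003", "250.02"),
]

def is_diabetes_code(icd9_code: str) -> bool:
    """True when the code belongs to the ICD-9 250.xx family."""
    return icd9_code.split(".")[0] == "250"

def members_with_diabetes_codes(rows):
    """Return the set of members with at least one 250.xx claim code."""
    return {member_id for member_id, code in rows if is_diabetes_code(code)}

flagged = members_with_diabetes_codes(claims)
print(sorted(flagged))  # ['M001', 'M003']
```

If M002 is in fact diabetic but the code never made it onto a claim, this rule has no way of finding that member.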

Medical records, nowadays often in the form of Electronic Health Records (EHRs), tend to have a more complete set of diagnoses. When claims are translated from medical records, some codes can be omitted for various reasons [6]. Because of this, many payers spend a lot of money retrieving and reviewing medical charts to see if they have missed any diagnosis codes. Does that solve the problem? We wish this were the ultimate solution for risk adjustment, but unfortunately it merely scratches the surface. There are, in fact, a large number of diabetic patients who do not appear to be diabetic from their medical records alone [7,8].

Data is always a partial representation of the world, and healthcare data is no exception. Claims, EHRs, laboratory results, and prescription data each capture the truth only as far as the purpose for which they were created requires. To define what diabetes is and who is diabetic using available data sources, we need to combine all these sources properly and derive comprehensive rules that define the “phenotypes” of diabetes.

In [9], R. L. Richesson et al. describe eight different ways of defining diabetic phenotypes (see the table below). Each phenotype definition uses different sources of data. For example, the CMS CCW definition uses only claims diagnosis codes, while eMERGE uses a combination of medical claims, prescription claims, and EHRs. Remarkably, each definition yields a different subset of the population (for more details, please read the paper itself). The authors state that “phenotype definitions with multiple components are required for a comprehensive definition for diabetes”.

[Table: the eight diabetes phenotype definitions, from Richesson et al. [9]]

Automatically Discovering Diabetes Phenotypes

If you don’t want to miss any diabetic member in your population, you probably need to apply multiple diabetes phenotype definitions to your available data. The results become more comprehensive as you add more definitions, and you can then identify the truly diabetic patients within the population those rules flag.
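To make this concrete, here is a minimal sketch that applies three toy definitions to the same members and takes their union. The member records, the drug list, and the three rules are hypothetical and do not reproduce any of the eight published definitions; the HbA1c cutoff of 6.5% is the commonly used diagnostic threshold.

```python
# Hypothetical member records combining claims, prescriptions, and labs.
members = {
    "M001": {"dx_codes": {"250.00"}, "drugs": {"metformin"}, "hba1c": [7.2]},
    "M002": {"dx_codes": {"401.9"}, "drugs": {"insulin glargine"}, "hba1c": []},
    "M003": {"dx_codes": set(), "drugs": set(), "hba1c": [6.9, 7.1]},
    "M004": {"dx_codes": {"401.9"}, "drugs": set(), "hba1c": [5.4]},
}

DIABETES_DRUGS = {"metformin", "insulin glargine"}  # illustrative, not exhaustive
HBA1C_THRESHOLD = 6.5  # percent

def by_claims(member):
    """Claims-only rule: any ICD-9 250.xx diagnosis code."""
    return any(code.startswith("250") for code in member["dx_codes"])

def by_prescription(member):
    """Prescription rule: any drug from the illustrative list."""
    return bool(member["drugs"] & DIABETES_DRUGS)

def by_lab(member):
    """Lab rule: any HbA1c result at or above the threshold."""
    return any(value >= HBA1C_THRESHOLD for value in member["hba1c"])

DEFINITIONS = {"claims": by_claims, "rx": by_prescription, "lab": by_lab}

# Each definition yields its own cohort; the union is the candidate list.
cohorts = {
    name: {mid for mid, record in members.items() if rule(record)}
    for name, rule in DEFINITIONS.items()
}
candidates = set().union(*cohorts.values())

for name, cohort in cohorts.items():
    print(name, sorted(cohort))
print("union", sorted(candidates))
```

Here the claims rule finds only M001, while the prescription and lab rules add M002 and M003. The union is a list of candidates rather than confirmed diagnoses, since a single rule (a metformin prescription, for instance) can fire for reasons other than diabetes.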

Wait! Are those eight definitions comprehensive enough? If you are on the same page as we are, you may be asking this question. And if you have read our other blog posts, you may already have guessed where we are heading: “Enter Machine Learning”. People often ask how our risk adjustment strategy differs from others, and in this blog post we reveal some of our secret sauce for the first time.

At Accordion Health, we developed machine learning software that discovers phenotype definitions [10,11,12], rather than discovering and defining those patterns manually. Our algorithms do this by analyzing millions of medical claims, prescription claims, EHRs, lab data, and eligibility data. They automatically extract meaningful phenotypic patterns and test their statistical significance; patterns that pass are saved, which has produced millions of different patterns. Some patterns are obvious, and some are counter-intuitive. For the latter, our Chief Medical Officer provides her expert opinion, and our engine evolves using this feedback.
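The significance-testing step can be illustrated with a small example. This post does not say which test the engine uses, so the sketch below assumes a one-sided Fisher's exact test as one standard choice; the counts and the 0.01 cutoff are hypothetical.

```python
from math import comb

def fisher_exact_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: pattern present, diabetic      b: pattern present, not diabetic
    c: pattern absent,  diabetic      d: pattern absent,  not diabetic
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    with_pattern = a + b
    diabetic = a + c
    total = a + b + c + d
    denominator = comb(total, diabetic)
    p_value = 0.0
    for k in range(a, min(with_pattern, diabetic) + 1):
        p_value += (
            comb(with_pattern, k)
            * comb(total - with_pattern, diabetic - k)
            / denominator
        )
    return p_value

ALPHA = 0.01  # hypothetical significance cutoff

# Hypothetical counts for one candidate pattern among 20 reviewed members.
p = fisher_exact_one_sided(a=8, b=2, c=1, d=9)
keep_pattern = p < ALPHA
print(f"p = {p:.5f}, keep pattern: {keep_pattern}")
```

With millions of candidate patterns, some correction for multiple comparisons would also be needed in practice; that is left out here to keep the sketch short.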

Thus far, our risk adjustment engine, ARISE, has analyzed hundreds of thousands of members, ingesting their medical records, billing information, and lab data. Every day, ARISE finds a new set of patterns for all the condition categories used in the HHS-HCC and CMS-HCC models. With these ever-growing patterns, ARISE will allow you to identify risk conditions comprehensively, much earlier and at lower cost.

For more information about our risk adjustment secret weapon, please contact us at info [at] accordionhealth [dot] com.

Appendix

An example of a diabetes phenotype pattern [13].

[Figure: example diabetes phenotype pattern, from [13]]

References

  1. https://www.gormanhealthgroup.com/blog/2016/07/07/2015-risk-adjustmentreinsurance-payments-published/
  2. http://www.centaurihs.com/blog/solutions-for-risk-adjustment-and-reinsurance/
  3. http://www.centaurihs.com/blog/the-gambler/
  4. http://www.rand.org/pubs/rgs_dissertations/RGSD171.html
  5. http://ajph.aphapublications.org/doi/pdf/10.2105/AJPH.82.2.243
  6. https://www.optum.com/content/dam/optum/resources/whitePapers/Benefits-of-using-both-claims-and-EMR-data-in-HC-analysis-WhitePaper-ACS.pdf
  7. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1855339/
  8. http://www.ncbi.nlm.nih.gov/pubmed/11814171
  9. http://jamia.oxfordjournals.org/content/jaminfo/20/e2/e319.full.pdf
  10. http://dl.acm.org/citation.cfm?id=2623658
  11. http://sites.duke.edu/rethinkingclinicaltrials/ehr-phenotyping/
  12. https://phekb.org/
  13. http://jamia.oxfordjournals.org/content/early/2015/09/03/jamia.ocv112