Using Digital Health Data to Improve Insurance Underwriting

Qingdi ZHAO
5 min read · May 26, 2021


Photo by National Cancer Institute on Unsplash

The COVID-19 pandemic has increased the risk of in-person business and personal interactions, and people’s lives are shifting toward the virtual world. Caught up in this sudden transformation, life insurers around the world are seeking information from electronic sources to strengthen their risk evaluation and to support a digital customer journey. Electronic health data from applicants and insureds is coming to the fore.

Applicants’ and insureds’ electronic health data can be obtained from various sources and then shared digitally through health information exchanges (HIEs). Health data can be structured, such as coded diagnoses (ICD) and lab test results with standard values, or unstructured, which typically consists of doctors’ notes and other documented information such as visit summaries or radiology results. Compared with electronic health records, which are generally generated in hospitals or clinics, some types of health data, such as pharmacy records, lab records, or health-related claims, are more widely available and easier to use. These sources can also provide more granularity to support a refined view of health and mortality risk.

Outside the medical field, the terms electronic health record (EHR) and electronic medical record (EMR) are often used interchangeably, but they are different.

  • An EMR is the digitized patient chart or record that medical providers or hospital systems use to track a patient. EMRs often vary by facility or provider. Although the industry is improving EMR interoperability, it is still difficult to share EMRs between providers other than by providing printed or certified copies.
  • An EHR includes one or more EMRs as well as additional information about the patient, which gives end-users a bigger picture than an EMR alone. EHRs can be transmitted across systems and browsed or compiled by stakeholders in a patient’s health journey, including the patient.

According to data from the Office of the National Coordinator for Health Information Technology and the Healthcare Information and Management Systems Society (HIMSS), more than 90% of office-based physician practices and more than 95% of hospitals have adopted EHR products since the Centers for Medicare & Medicaid Services implemented the EHR incentive program, and 1.2 billion clinical documents are produced in the United States each year.

The Digital Opportunity in EHR

The health data in EHRs creates opportunities for life insurance underwriting. Like EMR data, EHR data can be structured or unstructured. Unfortunately, up to 80% of EHR data is unstructured, which poses challenges for end-users. Yet unstructured EHR data contains information that is valuable for assessing health and mortality risk, such as the type, grade, and location of a specific cancer, or the degree of blockage in a heart vessel.

Unstructured data creates gaps in the gathered information because medical or service providers may document a condition or diagnosis in their notes without coding it in a structured way. That information then never appears as structured data, which creates further obstacles for analysis.

The detailed data in an EHR is vast and spans different coding systems. Structured data is easy to mine, but the volume of codes and the variety of coding systems still create challenges. In medical practice, the same condition, injury, or diagnosis can be coded differently in ICD-9, ICD-10, CPT, LOINC, and SNOMED-CT. ICD-10 contains more than 95,000 codes, and these change between updates and versions. For example, a code on the ICD-10 (2016) list can be changed, eliminated, or merged with another code on the ICD-10 (2019) list.

Using EHR data in underwriting brings other challenges as well, including coding errors in the structured data and the absence of required factors or data.

Elements of Automated Underwriting

EHR data can be provided in real time, but many underwriters still treat EHR information like traditional static data, collecting physician statements and reviewing them manually. An automated underwriting system supported by EHR data should be a real-time system that contains the following three components.

Automated Data Standardization

In digitizing the underwriting process, automated data standardization converts all medical coding systems in a given category, such as diagnoses or labs, into a single unified system. For example, diagnosis codes in ICD-9, ICD-10, or SNOMED-CT should be converted into one coding system, such as ICD-10. Automated data standardization also removes duplicated and redundant data and standardizes the units of lab results. For example, hemoglobin results can be expressed in different units: mmol/L, µmol/L, g/L, g/dL, g/100mL, g%, and mg/mL. Automated data standardization needs to convert them into a single unit so they can be measured, compared, and analyzed.
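A minimal sketch of what this standardization step might look like in Python. The crosswalk holds only two illustrative entries, and every name here (TO_ICD10, HGB_TO_G_PER_DL, and the three functions) is hypothetical. A production system would load a maintained mapping instead. Only mass-based hemoglobin units are converted, because molar units (mmol/L, µmol/L) depend on which molar-mass convention the lab uses.

```python
# Illustrative crosswalk entries only; a real system would load a maintained mapping.
TO_ICD10 = {
    ("ICD-9", "250.00"): "E11.9",        # type 2 diabetes without complications
    ("SNOMED-CT", "44054006"): "E11.9",  # type 2 diabetes mellitus
}

# Multipliers that convert mass-based hemoglobin units to g/dL.
HGB_TO_G_PER_DL = {
    "g/dL": 1.0,
    "g/100mL": 1.0,
    "g%": 1.0,
    "g/L": 0.1,
    "mg/mL": 0.1,
}


def to_icd10(system, code):
    """Return the ICD-10 code for a (system, code) pair, or None if unmapped."""
    if system == "ICD-10":
        return code
    return TO_ICD10.get((system, code))


def standardize_diagnoses(records):
    """Map (system, code) pairs to ICD-10 and drop duplicates, keeping order.

    Unmapped codes are skipped here; a real pipeline would route them to review.
    """
    unique = []
    for system, code in records:
        icd10 = to_icd10(system, code)
        if icd10 is not None and icd10 not in unique:
            unique.append(icd10)
    return unique


def hemoglobin_g_per_dl(value, unit):
    """Convert a hemoglobin result to g/dL; raise on units with no defined factor."""
    if unit not in HGB_TO_G_PER_DL:
        raise ValueError(f"no conversion defined for unit {unit!r}")
    return value * HGB_TO_G_PER_DL[unit]
```

Under these assumptions, a file that lists the same diagnosis three times, once in each coding system, collapses to a single ICD-10 entry, and a hemoglobin result of 140 g/L becomes about 14 g/dL.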

Automated data standardization also transforms unstructured text fields into structured, coded output, or at least extracts structured data from them. For example, when a doctor’s note contains “Type 2 diabetes” or related wording, the system can convert it to the ICD-10 (2019) code E11.
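The text-to-code step could be sketched with simple phrase matching, as below. This is an assumption-laden illustration, not a clinical NLP system: the PHRASE_RULES list and the extract_codes function are hypothetical, and the patterns cover only a few spellings of the diabetes example above.

```python
import re

# Hypothetical phrase-to-code rules; E11 is the ICD-10 category for type 2 diabetes.
PHRASE_RULES = [
    (re.compile(r"\btype\s*(?:2|ii)\s+diabetes\b", re.IGNORECASE), "E11"),
    (re.compile(r"\bt2dm\b", re.IGNORECASE), "E11"),
]


def extract_codes(note):
    """Return the sorted, de-duplicated ICD-10 codes whose phrases appear in the note."""
    return sorted({code for pattern, code in PHRASE_RULES if pattern.search(note)})
```

Plain keyword matching like this does not handle negation ("no history of type 2 diabetes") or family history, so its output would still need context checks before it feeds an underwriting decision.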

Automated Data Validation

Automated data validation applies defined rules to fill in missing data and to correct inconsistencies or errors within the dataset. For example, when a patient shows blood glucose levels that are consistently elevated to the diagnostic threshold for diabetes but the risk file contains no coded diabetes diagnosis, a rule could indicate that a diagnosis code might be missing. In assessing this kind of risk, the underwriter needs to re-evaluate the potential for diabetes, which could increase health and mortality risk.
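The glucose rule described above might be expressed as follows. The threshold, the minimum number of elevated readings, the code prefixes, and the function name are all assumptions made for this sketch. An insurer would take the actual values from its own underwriting manual.

```python
# Assumed values for this sketch, not taken from any underwriting manual.
FASTING_GLUCOSE_THRESHOLD_MG_DL = 126  # assumed diagnostic cutoff for fasting glucose
MIN_ELEVATED_READINGS = 2              # assumed meaning of "consistently elevated"
DIABETES_PREFIXES = ("E10", "E11")     # assumed ICD-10 categories for type 1 and type 2


def missing_diabetes_code(fasting_glucose_mg_dl, icd10_codes):
    """Return True when glucose is repeatedly elevated but no diabetes code is on file."""
    elevated = [g for g in fasting_glucose_mg_dl if g >= FASTING_GLUCOSE_THRESHOLD_MG_DL]
    has_diabetes_code = any(code.startswith(DIABETES_PREFIXES) for code in icd10_codes)
    return len(elevated) >= MIN_ELEVATED_READINGS and not has_diabetes_code
```

A True result does not fill in a diagnosis by itself. It flags the file so the underwriter can re-evaluate the potential diabetes risk.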

Automated Scoring

An automated scoring system examines relevant risk elements and cross-checks them against a digital underwriting manual to classify an individual’s health or mortality risk. The score could be numerical (debits, credits, and a final score), categorical (standard, mild substandard, substandard, and non-standard), or a combination of both.
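A debit and credit score combined with a categorical class could be sketched like this. Every number below is invented for illustration: the base score, the debit and credit tables, and the class bands are hypothetical stand-ins for what a digital underwriting manual would supply.

```python
# All values are hypothetical placeholders for a digital underwriting manual.
BASE_SCORE = 100
DEBITS = {"E11": 75, "I10": 25}   # debits keyed by ICD-10 category
CREDITS = {"non_smoker": 25}      # credits keyed by favorable factor
CLASS_BANDS = [                   # (upper bound of final score, risk class)
    (125, "standard"),
    (175, "mild substandard"),
    (250, "substandard"),
]


def classify(final_score):
    """Map a final numeric score to a risk class using the assumed bands."""
    for upper_bound, label in CLASS_BANDS:
        if final_score <= upper_bound:
            return label
    return "non-standard"


def score_applicant(icd10_codes, favorable_factors):
    """Return debits, credits, final score, and risk class for one applicant."""
    categories = {code.split(".")[0] for code in icd10_codes}
    debits = sum(DEBITS.get(category, 0) for category in categories)
    credits = sum(CREDITS.get(factor, 0) for factor in favorable_factors)
    final_score = BASE_SCORE + debits - credits
    return {
        "debits": debits,
        "credits": credits,
        "final_score": final_score,
        "risk_class": classify(final_score),
    }
```

With these made-up tables, an applicant coded E11.9 who is a non-smoker gets 75 debits and 25 credits, for a final score of 150 and a class of mild substandard.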

As an illustration, the revised underwriting process is shown below.

Conclusion

Extracting the full value of EHR data for automated underwriting is a long journey, and the challenges of developing such a system are significant because the development plan involves staffing (mainly in the underwriting and data departments), data assets, and a tremendous upfront investment. It is wise to divide the journey into steps, assemble the right resources, and obtain external help when it is needed. In practice, insurers can engage suitable vendors to develop the system, but for a capable insurer, keeping development in-house might be a better choice. The payoff is significant: such a system brings sales growth, internal efficiency, and adaptability to a rapidly changing world.
