Unsexy but Essential: LLMs are Revolutionizing Document Processing in Healthcare

Manny Bernabe
Published in Ushur Engineering · 6 min read · Jul 11, 2024

Ushur’s AI driving real business impact

Co-Authors: Manny Bernabe, Chintan Gotecha, and Vrajesh Sejpal

Image generated by DALL-E 3 via ChatGPT, OpenAI, July 11, 2024.

Large language models (LLMs) are the core technology behind popular AI applications like ChatGPT, where they have been optimized to provide high-quality, finely tuned assistance. The same foundational technology, however, can be adapted to a variety of tasks that may initially seem mundane but are crucially impactful. The example I'm about to discuss may not sound glamorous; it involves a lot of what many would consider tedious work. Yet it is precisely these kinds of tasks that hold substantial economic value for the businesses we're aiding, showcasing the broad potential of LLMs beyond the buzzworthy applications in image generation and voice interaction.

At Ushur, we work with various healthcare providers and payers that manage numerous orders from doctors' offices for specific medical procedures. Their task is to receive these orders, often via fax, verify them, and then route them to the correct local vendor. The reliance on faxed documents, particularly prevalent in the U.S. healthcare system, may seem outdated, yet it remains a stark reality. Personally, I've experienced the challenges this system presents, such as when I retrieved a box full of disorganized documents for my father's medical procedure, a cumbersome and all-too-common scenario.

What LLMs bring to the table for these providers is the capability to streamline this convoluted process. We help them by aggregating the orders, extracting the pertinent information, and integrating this data into their systems of record and into a user interface that surfaces actionable insights.

Why This is Hard for Traditional AI Methods

This task is complex due to the diversity of document formats, the quality of document scans, handwritten notes, the addition of lists and checkboxes, and the critical need to maintain patient confidentiality — not to mention the infamous doctor handwriting. The orders we process might be buried within multi-page documents, surrounded by irrelevant data that still requires careful handling due to the inclusion of sensitive information.

Traditional Optical Character Recognition (OCR) and Natural Language Understanding (NLU) methods struggle here for several reasons. First, the data is highly disorganized and inconsistent, making standard classification methods difficult to apply. Much of it lives in tables that must be extracted and rendered in a more natural-language form, and these tables are rarely clean: they are multi-column, with differing sub-columns, and they vary so much from document to document that traditional natural language processing techniques cannot treat them consistently. Second, examples for certain classifications, like workers' compensation claims, are too sparse for traditional NLU methods to train on effectively.
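To make that concrete, here is a minimal sketch of the kind of table flattening involved, assuming the OCR step has already produced headers and rows. The table contents and field names here are invented purely for illustration:

```python
# Hypothetical example: flattening an OCR-extracted table into
# natural-language lines an LLM can read reliably. The structure
# below is an assumption for illustration, not a real order.

table = {
    "headers": ["Procedure", "CPT Code", "Laterality"],
    "rows": [
        ["MRI Brain", "70551", "N/A"],
        ["X-Ray Knee", "73560", "Left"],
    ],
}

def flatten_table(table: dict) -> str:
    """Render each row as 'header: value' pairs on one line."""
    lines = []
    for row in table["rows"]:
        pairs = [f"{h}: {v}" for h, v in zip(table["headers"], row)]
        lines.append("; ".join(pairs))
    return "\n".join(lines)

print(flatten_table(table))
# Procedure: MRI Brain; CPT Code: 70551; Laterality: N/A
# Procedure: X-Ray Knee; CPT Code: 73560; Laterality: Left
```

Rendering each row as header-value pairs gives the model a consistent view of content that would otherwise arrive as loosely aligned columns.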

This is where we leverage the strength of LLMs, optimizing and tailoring them for the use case. They provide flexibility that allows us to fine-tune models with smaller, less conventional datasets. By employing a strategic mix of prompts and an array of fine-tuned models, we achieve accurate classifications and streamline the data extraction process.

LLMs allow us to apply contextual awareness before performing a classification. For instance, given two similar labels, say a biopsy report versus a blood-work lab report, a traditional classification method might be confused by the underlying text of the report alone. With LLMs, we can draw context from the broader document to lean one way or the other. This contextual awareness is a powerful and handy tool in these LLM implementations.
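As an illustration (this is not our production prompt), a classification prompt might inject text from surrounding pages so the model can disambiguate the two report types:

```python
# Hypothetical prompt construction: supply context from neighboring
# pages alongside the page being classified. Labels are placeholders.

def build_classification_prompt(page_text: str, context_text: str) -> str:
    return (
        "You are classifying one page from a multi-page medical fax.\n"
        f"Context from surrounding pages:\n{context_text}\n\n"
        f"Page to classify:\n{page_text}\n\n"
        "Answer with exactly one label: biopsy_report or blood_lab_report."
    )
```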

Here’s Our Process and What We’ve Learned

For a healthcare service provider, here is the workflow (a minimal pipeline sketch follows the list):

  1. Receiving Fax Documents: The initial reception of various healthcare documents via fax, often as very poor-quality scans.
  2. Grouping and Segmenting: Organizing these documents into coherent groups for easier processing.
  3. Classifying: Utilizing tailored LLMs at Ushur to categorize documents into predefined classes, such as order forms, patient demographic data, identification information like driver’s licenses and passports, lab reports, insurance authorizations, and clinical notes.
  4. Extracting Key Value Pairs: Pinpointing and extracting essential data from the classified documents.
  5. Pushing Downstream for Action: Integrating the structured data into systems like Salesforce for actionable business insights and operations.
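Here is a minimal sketch of how those five steps might be wired together. Every function is a hypothetical stub standing in for the real components (the OCR engine, the fine-tuned classifiers, the Salesforce integration), so the names and return values are assumptions:

```python
from typing import Dict, List

def receive_fax(path: str) -> List[str]:
    """Step 1: load scanned fax pages (stub returns placeholder text)."""
    return ["page 1 text ...", "page 2 text ..."]

def group_pages(pages: List[str]) -> List[List[str]]:
    """Step 2: segment pages into coherent documents (stub: one group)."""
    return [pages]

def classify(doc: List[str]) -> str:
    """Step 3: label the document; a fine-tuned LLM in practice."""
    return "order_form"

def extract_fields(doc: List[str]) -> Dict[str, str]:
    """Step 4: pull key-value pairs; an extraction LLM in practice."""
    return {"patient_name": "...", "procedure": "..."}

def push_downstream(record: Dict[str, str]) -> None:
    """Step 5: hand structured data to the system of record."""
    print("pushed:", record)

for doc in group_pages(receive_fax("incoming.fax")):
    label = classify(doc)
    if label == "order_form":
        push_downstream(extract_fields(doc))
```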

In Ushur’s IDA (Intelligent Document Automation) stack, the text is initially extracted via the OCR engine, and then a slew of fine-tuned proprietary LLMs are used for classification and extraction of data. Our models achieve up to 96% accuracy on unseen samples, up from 20% with our earlier traditional AI approaches.
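The extraction piece can be pictured as a prompt that asks for a fixed JSON schema. The field list below is an illustrative assumption, not our actual configuration:

```python
import json

# Hypothetical extraction prompt; fields and schema are placeholders.
EXTRACTION_PROMPT = """Extract these fields from the order text below and
return JSON only: patient_name, date_of_birth, ordering_physician,
procedure, cpt_code. Use null for any field that is missing.

Order text:
{text}
"""

def parse_extraction(llm_output: str) -> dict:
    """Parse the model's JSON reply, tolerating stray whitespace."""
    return json.loads(llm_output.strip())

print(EXTRACTION_PROMPT.format(text="Patient: Jane Doe ..."))
```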

Key Learnings and Findings

In our journey of integrating LLMs into healthcare document processing, we’ve discovered valuable insights about when to use third-party LLMs versus fine-tuning our in-house models. Here’s what we’ve learned:

When to Use Third-Party LLMs

Third-party LLMs shine in scenarios where tasks are straightforward to explain. These models are perfect when you need decent accuracy right out of the box without extensive training data. This is especially useful in the early stages of a project when requirements are frequently changing. Managing annotated datasets during these frequent changes can be time-consuming and costly. For quick proof of concept, third-party LLMs allow us to get a system up and running swiftly, providing full functionality and decent accuracy without significant development effort. Additionally, their per-token costing mechanisms are cost-effective for workloads that vary in intensity.
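For example, a quick proof of concept might call a hosted model directly. The sketch below assumes the OpenAI Python SDK (v1+) with an API key in the environment; the model name and label set are placeholders, and real patient documents should of course not be sent to a third party without the appropriate agreements in place:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LABELS = ["order_form", "lab_report", "insurance_authorization", "clinical_note"]

def classify_zero_shot(page_text: str) -> str:
    """Zero-shot page classification against a hosted model."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Classify the faxed healthcare page into exactly "
                        f"one of these labels: {', '.join(LABELS)}."},
            {"role": "user", "content": page_text},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip()
```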

Advantages of Fine-Tuned LLMs

On the other hand, fine-tuning our in-house LLMs pays off when we need better performance than third-party LLMs are delivering. The nice thing is that you don't need much more data to get a meaningful lift: sometimes as few as 15 additional samples for a particular class can help. This is an additional benefit of LLMs over traditional methods, which historically required far more labeled data to train. That matters enormously for managing our relationships with customers and clients, since many large customers cannot or do not provide extensive historical labeled data. The less of it we require, the better, and these LLMs let us do more with smaller labeled datasets.
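To illustrate the pattern (with a small open encoder standing in for our proprietary LLMs, purely for brevity), a handful of new samples per class can be folded into a standard fine-tuning loop. The model name, data, and hyperparameters below are all placeholders:

```python
# Illustrative small-data fine-tune with Hugging Face Transformers.
# This is a sketch of the pattern, not Ushur's training code.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)

labels = ["order_form", "workers_comp_claim"]
samples = [  # a few extra examples per class can already move the needle
    {"text": "Authorization request for MRI of the brain ...", "label": 0},
    {"text": "Workers' compensation claim for workplace injury ...", "label": 1},
]

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
ds = Dataset.from_list(samples).map(
    lambda ex: tok(ex["text"], truncation=True, padding="max_length"),
    batched=True,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=len(labels))

Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=ds,
).train()
```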

That said, Ushur's pre-trained, fine-tuned models provide excellent out-of-the-box performance. In specific cases, we have enabled customers to fine-tune these LLMs with their own enterprise data when it is available or when the use case calls for it.

Fine-tuning allows us to optimize the models to reduce latency and operational costs, especially by minimizing input tokens. Paying for infrastructure rather than per request can be more economical, depending on the workload and utilization.
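A back-of-envelope comparison makes the trade-off concrete. Every number below is hypothetical:

```python
# Hypothetical cost comparison: per-token API pricing versus a flat
# infrastructure cost for a self-hosted, fine-tuned model.

requests_per_month = 500_000
tokens_per_request = 2_000           # input + output, placeholder
api_price_per_1k_tokens = 0.002      # placeholder $/1K tokens

api_cost = (requests_per_month * tokens_per_request / 1_000
            * api_price_per_1k_tokens)
gpu_cost = 1.50 * 24 * 30            # one GPU at a placeholder $1.50/hr

print(f"API: ${api_cost:,.0f}/mo")   # $2,000/mo at these assumptions
print(f"GPU: ${gpu_cost:,.0f}/mo")   # $1,080/mo, favoring hosting here
```

At lower or spikier volumes the comparison flips, which is exactly why per-token pricing suits variable workloads and hosting suits steady ones.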

Moreover, fine-tuning gives us finer control over the model’s output, allowing us to achieve higher accuracy for specific tasks with relatively little new training data compared to traditional AI methods. An additional benefit is the ability to host fine-tuned models in-house, which is crucial for highly regulated industries with strict privacy and compliance concerns.

However, fine-tuning requires access to data, which isn’t always available. Synthetic data generation might not accurately represent real-life use cases. In such cases, instead of halting progress due to a lack of training data, we can start with prompt-based approaches using very large pre-trained models. These models, relying on their extensive training, can deliver decent accuracies and provide a solid starting point.

What’s Next

We aim to focus on continual improvement in efficiency and accuracy and on developing cost-effective scaling strategies. We will also augment our stack with multi-modal LLMs (MLLMs) that leverage the strengths of both LLMs and large vision models (LVMs) for enhanced document understanding.

Conclusion

The integration of LLMs into healthcare document processing is more than a technical upgrade — it is a transformative shift towards more efficient, secure, and reliable healthcare automation. As we continue to refine our processes and expand the capabilities of our models, we are setting new benchmarks for operational excellence in the healthcare sector.

Stay tuned for updates as we dive into other challenges such as scaling strategies and more about handling sensitive medical information in upcoming discussions.
