crafting the future of healthcare with LLMs

marta g. zanchi
Published in nina capital
8 min read · Dec 2, 2024

thoughtful design for meaningful impact

DECEMBER 2024

by Nadin Youssef

Artificial intelligence's rapid development is transforming many industries, but perhaps none more profoundly than healthcare. Large Language Models (LLMs) have emerged as a potentially revolutionary tool that could address many of the challenges facing healthcare systems worldwide. Today, the “back-end” of healthcare workflows and processes is largely managed manually, which makes it extremely time-consuming to maintain and prone to error. At the same time, healthcare systems around the world continue to deteriorate: patients are paying more out of pocket than ever before, workforce shortages persist even as healthcare professionals work at full capacity, budgets for innovation are scarce, and waiting lists for care continue to grow. Against this bleak backdrop, the emergence of LLMs has lit a fuse in the healthcare innovation landscape. Could they be used to make administration more efficient? Or to summarize the pages of clinical notes that physicians have to go through when admitting new patients?

Like any emerging technology, however, LLMs hold great promise but also have significant limitations that could lead to patient harm. Today, researchers and innovators are working hard to understand what these limitations are.

what are Large Language Models and how do they differ from traditional AI?

Large Language Models (LLMs) — such as GPT-4, BERT, Gemini, and LaMDA — represent a significant evolution in artificial intelligence, leveraging deep learning architectures to perform complex and versatile tasks. Unlike traditional AI models, which are purpose-built to perform narrowly defined functions, LLMs are trained on vast datasets encompassing diverse sources such as the internet, scientific literature, and articles. This extensive training enables them to undertake a wide range of tasks, including summarizing extensive medical literature, supporting clinical decision-making, and serving as virtual care companions.

the distinction between LLMs and traditional AI

The primary distinction between LLMs and traditional AI lies in their scope and adaptability. Traditional AI models in healthcare are designed for specific, well-defined applications. For example, an AI model in radiology may be trained to detect anomalies in medical imaging, or another model may analyze structured electronic health record (EHR) data to predict patient risk for sepsis. These systems excel within their specialized domains but lack the flexibility to generalize beyond their training.

In contrast, LLMs possess remarkable generalizability. By training on diverse and unstructured datasets, they can generate coherent and contextually relevant text, making them highly adaptable to various applications. This flexibility allows LLMs to excel in areas such as:

  • Medical Documentation: Automating and streamlining clinical notes, reducing administrative burdens for healthcare professionals.
  • Patient Communication: Crafting personalized, empathetic responses to patient inquiries and supporting health literacy.
  • Research and Education: Summarizing complex medical studies or generating educational content for clinicians and patients.

why generalizability matters in healthcare

The generalizability of LLMs opens new possibilities for integrating AI into healthcare workflows. While traditional AI remains indispensable for tasks requiring precision and narrowly focused expertise, LLMs provide a complementary layer of functionality. For example, they can assist in synthesizing insights across disciplines, enabling clinicians to stay updated with the latest advancements or aiding patients in understanding complex medical conditions.

However, with this generalizability comes a responsibility to ensure these models are deployed safely and ethically, particularly in a field as sensitive as healthcare. Challenges such as maintaining data privacy, minimizing bias, and ensuring explainability must be addressed to fully realize the potential of LLMs in enhancing patient outcomes.

uses of LLM-based technologies we’ve seen across digital health

This year, we at Nina witnessed a large number of companies, over 50 just this quarter, using LLMs across various use cases. The majority fall into four key buckets:

  1. Clinical decision support: LLMs that synthesize large volumes of unstructured medical literature or clinical guidelines to help clinicians stay current. Most of these companies offer products to hospital systems, primary care clinics, and specialty clinics that enable physicians to save time sifting through disparate sources of local, regional, or centralized guidelines for patient care.
  2. Medical documentation and billing: LLMs excel at processing and generating natural language, making them very useful for extracting text and automating tedious tasks such as medical transcription, clinical documentation, and coding. Many of these companies offer products to third-party revenue cycle management companies, large hospital systems, or independent clinics to streamline their billing processes, increase administrative efficiency, and capture reimbursements that staff may have missed.
  3. Patient engagement and education: LLMs that answer patients’ questions about their condition, provide guidance, or improve engagement by delivering timely notifications and information. Interestingly, we found that these companies mostly sell to providers managing mental health or chronic disease patients, with the aim of sustaining patient engagement and improving health outcomes.
  4. Drug discovery and genomics analysis: LLMs trained on scientific literature and genomic data that improve the identification of promising drug candidates, drug targets, or potential genetic markers in a population. These companies cater to biotech and pharmaceutical companies looking to streamline their workflows, which could accelerate preclinical research and development.

so, what’s the hold-up?

Despite their transformative potential, Large Language Models (LLMs) face several significant limitations that could hinder their widespread adoption in healthcare, particularly in patient care settings. Numerous studies have highlighted critical flaws that must be addressed before these technologies can be safely and effectively deployed.

1. Accuracy and Reliability

LLMs are not infallible; in fact, they are prone to generating errors or “hallucinations.” This can lead to incorrect or misleading outputs, particularly in ambiguous or complex medical scenarios. A study published in Nature Medicine revealed that LLMs struggled to accurately diagnose patients, adhere to treatment guidelines, or interpret laboratory results correctly. Similarly, research from Harvard Medical School assessed an LLM’s performance in responding to patient messages and found several instances of safety errors. In one alarming case, the advice provided could have been fatal if acted upon. Such instances underscore the critical need for rigorous testing and validation before LLMs can be entrusted with patient care.

2. Bias and Fairness

Bias in healthcare AI is a pervasive issue, and LLMs are no exception. These models inherit biases from the datasets they are trained on, which can exacerbate health disparities and compromise equity. A Stanford study tested four commonly used LLMs and found troubling results: all four models promoted race-based medicine when tasked with scenarios like calculating lung capacity for Black patients or estimating glomerular filtration rate (eGFR) for Black women. Another study evaluated LLMs in predicting hospitalizations, costs, and mortality and revealed stark discrepancies: for white populations, LLMs predicted higher costs, longer hospitalizations, and more optimistic prognoses than for populations of color. These findings raise serious concerns about fairness and equity, as biased algorithms risk further marginalizing vulnerable groups.
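To make “race-based medicine” concrete, here is a minimal sketch contrasting the 2009 CKD-EPI creatinine equation, which multiplies the estimate by a fixed coefficient for Black patients, with the 2021 refit that drops the race term. The function names and the example patient are our own illustration, the coefficients should be verified against the original publications before any real use, and none of this is clinical software.

```python
def egfr_ckd_epi_2009(scr_mg_dl: float, age: float, female: bool, black: bool) -> float:
    """2009 CKD-EPI creatinine equation (includes a race coefficient)."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = 141 * min(ratio, 1.0) ** alpha * max(ratio, 1.0) ** -1.209 * 0.993 ** age
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159  # the race-based adjustment that has since been removed
    return egfr


def egfr_ckd_epi_2021(scr_mg_dl: float, age: float, female: bool) -> float:
    """2021 CKD-EPI creatinine equation (no race term)."""
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = 142 * min(ratio, 1.0) ** alpha * max(ratio, 1.0) ** -1.200 * 0.9938 ** age
    if female:
        egfr *= 1.012
    return egfr


# Hypothetical patient: same lab value, same age, same sex.
with_race = egfr_ckd_epi_2009(1.0, 50, female=True, black=True)
without_race = egfr_ckd_epi_2009(1.0, 50, female=True, black=False)
print(f"2009 equation, Black:     {with_race:.1f}")
print(f"2009 equation, non-Black: {without_race:.1f}")
print(f"2021 equation:            {egfr_ckd_epi_2021(1.0, 50, female=True):.1f}")
```

With identical inputs, the 2009 equation reports a kidney function estimate roughly 16% higher for a Black patient, which can shift a patient across a staging or referral threshold. An LLM that reproduces the older formula from its training data carries that adjustment forward.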

3. Data Privacy

Healthcare data is uniquely sensitive, and safeguarding patient privacy is paramount. The training of LLMs on large datasets, including medical records, introduces risks of data breaches, unauthorized access, and even re-identification of anonymized patients. The challenge lies in balancing the utility of these models with compliance with stringent privacy regulations such as HIPAA in the U.S. or GDPR in Europe. Developers, innovators, and policymakers are currently working to establish best practices for data handling and privacy standards in healthcare-specific LLM applications.
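One common first line of defense is to strip obvious identifiers from free text before it ever reaches a model. The sketch below is a deliberately simple illustration of that idea using regular expressions. The patterns, labels, and sample note are hypothetical, and real de-identification (for example, under the HIPAA Safe Harbor method, which lists 18 categories of identifiers) requires far broader coverage and validation than this.

```python
import re

# Hypothetical patterns for illustration only; names, addresses, and many
# other identifiers are not covered here.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),
}


def redact(text: str) -> str:
    """Replace each matched identifier with a bracketed placeholder label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


note = "Patient seen 03/14/2024, MRN: 123456, call 555-123-4567 or jane.doe@example.com."
print(redact(note))
```

Pattern matching like this catches only well-formatted identifiers, so it would normally sit alongside access controls, audit logging, and contractual safeguards rather than replace them.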

4. Limited Explainability

Often referred to as “black boxes,” LLMs lack transparency in their decision-making processes. This opacity makes it difficult for clinicians and regulatory bodies to trust these models, particularly in high-stakes environments like healthcare. For instance, studies have found that even when LLMs generate accurate answers, their explanations are often flawed or nonsensical, demonstrating a lack of true understanding. The resulting mistrust poses a significant barrier to adoption. In response, some AI companies are positioning themselves as leaders in “explainable AI,” aiming to differentiate themselves by offering transparency to build trust among clinicians, investors, and end users.

Among these limitations, accuracy and reliability are expected to improve over time, as evidenced by the leap in performance between GPT-3 and GPT-4. However, addressing the deeper challenges of bias, fairness, explainability, and data privacy will require coordinated efforts across the healthcare spectrum. Innovators must prioritize building models that are not only effective but also equitable, transparent, and safe.

These challenges are not insurmountable, but they demand collaboration among stakeholders — clinicians, developers, regulators, and investors alike. With a shared commitment to ethical and patient-centered innovation, we can work toward LLMs that truly enhance healthcare outcomes without compromising trust or equity. The promise of LLMs in healthcare is clear, but realizing their full potential requires a deliberate, multidisciplinary effort.

embracing the transformative potential of Large Language Models in healthcare: a call to action

LLMs are not merely a technological milestone; they represent a profound opportunity to redefine healthcare delivery and innovation. From aiding clinical decision-making and optimizing administrative workflows to fostering patient engagement and expediting research and drug discovery, the transformative potential of LLMs spans the entire healthcare ecosystem. However, to harness this potential responsibly, strategic and collaborative action is imperative.

For Investors: Prioritizing Safe and Ethical Innovation

The investment landscape in healthtech is at a pivotal juncture. Now is the time to back startups and initiatives that champion the safe, ethical, and transparent deployment of LLMs in healthcare. The most promising ventures we’ve observed focus on solving targeted, well-defined problems while emphasizing explainability, utilizing diverse training datasets, and implementing robust privacy-preserving mechanisms. These attributes are not just desirable but essential for the long-term viability and societal acceptance of LLM-based technologies in sensitive domains like healthcare.

For Innovators: Co-Design and Regulatory Integration

Healthtech innovators must prioritize co-design with clinicians to ensure that their solutions address real-world clinical needs and can be seamlessly integrated into existing workflows. Collaboration with regulatory experts is equally critical, as the development of LLM-based tools requires compliance with stringent healthcare regulations. Advocating for regulatory sandboxes, such as the UK’s Medicines and Healthcare products Regulatory Agency (MHRA) AI Airlock sandbox or the FDA’s precisionFDA platform, can facilitate responsible experimentation and clinical validation, offering innovators a structured pathway to bring their technologies to market.

For Clinicians and Healthcare Organizations: Shaping the Future Through Collaboration

Clinicians and healthcare organizations should proactively partner with technology developers to shape and pilot LLM-based tools. Early engagement ensures these technologies align with clinical realities and user needs. Controlled pilot programs can enable organizations to remain at the forefront of digital health innovation while setting best practices for the ethical and effective use of LLMs in clinical environments.

Contextualizing the Role of LLMs in Clinical Settings

It is crucial to recognize that LLMs are likely to excel in certain areas while requiring augmentation in others. For instance, a recent randomized clinical trial involving 50 physicians assessed LLMs’ impact on diagnostic reasoning. The results revealed that while the LLM on its own performed better than physicians at diagnosing cases, giving physicians access to the LLM as a supplementary tool did not significantly improve their diagnostic reasoning. These findings underscore the importance of strategic integration, where LLMs complement rather than attempt to replace human expertise.

Human-in-the-Loop vs. AI-in-the-Loop: Rethinking the Approach

The conversation around human-in-the-loop models in healthcare is vital for mitigating risks to patient safety. However, we must also consider the inverse: integrating AI into human workflows as an “AI-in-the-loop” approach. By designing LLM-based tools with this intent from the outset, we could potentially reduce bias, enhance explainability, and improve integration into clinical settings. This shift in perspective could lead to the development of LLMs that not only solve problems more effectively but also align more closely with the needs and workflows of healthcare teams.
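As a rough illustration of what a human-in-the-loop safeguard can look like in software, the sketch below treats every LLM output as a draft that cannot be sent to a patient until a clinician has signed off, optionally after editing it. The class, function names, and clinician identifier are hypothetical and not drawn from any particular product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Draft:
    """An LLM-generated reply awaiting clinician review."""
    patient_message: str
    llm_reply: str
    approved_by: Optional[str] = None


def approve(draft: Draft, clinician_id: str, edited_reply: Optional[str] = None) -> Draft:
    """Record clinician sign-off, replacing the text if the clinician edited it."""
    if edited_reply is not None:
        draft.llm_reply = edited_reply
    draft.approved_by = clinician_id
    return draft


def send(draft: Draft) -> str:
    """Release a reply only if a clinician has approved it."""
    if draft.approved_by is None:
        raise PermissionError("LLM draft has not been reviewed by a clinician")
    return draft.llm_reply


draft = Draft("Can I double my dose tonight?", "Yes, that should be fine.")
try:
    send(draft)
except PermissionError as err:
    print(f"Blocked: {err}")

approve(draft, "clinician-001", edited_reply="Please do not change your dose before we speak.")
print(send(draft))
```

The design choice worth noting is that the gate lives in the sending step rather than in the model: an unreviewed draft fails loudly instead of relying on users to remember to check it.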

A Collaborative Path Forward

The promise of LLMs in healthcare is clear, but their successful implementation requires coordinated efforts across investment, innovation, regulation, and clinical practice. By committing to ethical, patient-centered, and need-driven innovation, we can leverage LLMs to create healthcare systems that are more responsive, inclusive, and effective. The journey ahead demands collaboration, foresight, and an unwavering focus on delivering meaningful improvements to patient care.


nina capital is a new venture capital firm investing at the intersection of healthcare and deep technology.

Written by marta g. zanchi

health∩tech. recognizing the need = primary condition for innovation. founder, managing partner @ninacapital
