ChatGPT for Unlocking Outcomes from Clinical Trials

Amelie
Published in Vitae Evidence
Jul 9, 2023 · 8 min read

Have you ever experienced the awe of witnessing breakthrough technology in action?

That’s exactly how I felt when I first encountered ChatGPT.

In the past few months, I’ve used ChatGPT for various tasks: brainstorming ideas, rewriting text to be concise, compelling, patient-friendly, or clinician-friendly, and, just for fun, producing code I could copy and paste to create simple web applications.

I’m amazed by the breakthrough technology of ChatGPT. Its underlying deep learning technique, the Large Language Model (LLM), allows it to provide human-like responses to a wide range of natural-language queries. The whole experience truly feels magical.

Beware: when asked for a list of references, ChatGPT will confidently provide very realistic-looking references, but if you check them, they may point to nothing related to your question.

Make sure to double-check.

AI Hallucinations are a well-known limitation of LLMs

What about ChatGPT for clinical research tasks?

At Vitae Evidence, we’re on a mission to harness the power of Large Language Models for matching clinical trials’ eligibility criteria to patients, as well as extracting outcomes data from past trials published in the literature.

We create software that helps doctors choose the right treatment for their patients. It uses a type of artificial intelligence that can explain how it reached its conclusion (explainable AI). It looks at the patient’s medical history, certain molecules in their body (biomarkers), and the patient’s preferences.

Vitae Evidence AI process

We fine-tune pre-trained clinical LLMs, which means that we teach GPT-like models (and other transformers) to perform various tasks in the domain of personalised medicine. We then combine the results with other AI techniques to produce reliable, trustworthy, current, evidence-based, and actionable results that support clinical decisions. We comply with all the regulatory requirements of medical device software.

So, when ChatGPT was released, I couldn’t wait to put it to the test and see whether it would outperform our existing clinical LLM code.

Asking like a researcher

In the context of clinical trial outcome data extraction, ChatGPT has the potential to be a game-changer. By processing natural language queries, researchers can save time and resources in extracting specific information from vast amounts of clinical trial data.

Let’s try! After experimenting with various prompts to navigate past the safety net that cautions against providing healthcare advice, I received this response:

OK, so it could not give me statistics about a specific treatment. To be fair, I would have been really surprised if it had answered that question. At their core, LLMs don’t understand what they process, and unless they are paired with other tools, they cannot execute mathematical operations. Their training revolves around predicting the most probable next word, with the likelihood of each subsequent word heavily influenced by the dataset used for training.
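To make that concrete, here is a toy next-word predictor, entirely my own illustration: real LLMs use neural networks over tokens rather than word bigrams, but the principle of continuing text with the statistically most likely word is the same.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the corpus."""
    words = text.split()
    nexts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        nexts[current][following] += 1
    return nexts

def predict_next(model, word):
    """Return the most frequent continuation seen in training, or None."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]

# A tiny training corpus: the model can only ever echo its statistics.
corpus = ("the trial met the primary endpoint "
          "the trial met the secondary endpoint")
model = train_bigram(corpus)
print(predict_next(model, "the"))  # prints: trial
```

Notice the model has no notion of whether the trial really met its endpoint; it just emits the most likely continuation, which is exactly why fluent output is no guarantee of truth.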

Can I get ChatGPT to summarise important information from a clinical trial?

Here is my first prompt,

and the answer I got:

Unfortunately, ChatGPT generated fabricated content that merely appeared to be aligned with my query.

Mmm that was weird.

Asking ChatGPT to verify — it admits the error.

This was not the proper trial at all!

Our experience with ChatGPT in clinical trials has been a mix of excitement and disappointment. We’ve experimented with various prompts and approaches to obtain accurate information, but it hasn’t always been straightforward. When asked about trial data, ChatGPT’s responses seemed utterly convincing, only for us to find out that they were completely unrelated to the queried trial.

It’s like being caught in a magician’s trick, where the illusion seems so real.

And if it makes up content this realistically and confidently, how can we trust it for anything else, especially for topics as important as healthcare?

I’ve tried multiple queries, and ChatGPT tries so hard to please that it tends to provide deceptive answers, often pointing to unrelated information. These “hallucinations” can be both frustrating and misleading, but they also underscore the complexities of training AI models.

ChatGPT’s magic at producing convincing JSON

Let’s try another approach:

Can you extract the condition, molecular biomarker requirements, experimental interventions, comparator interventions, and outcome results data from NCT03499899 and answer using JSON format?

Wow! It looks so convincing. But then I read the actual trial details on https://clinicaltrials.gov/ct2/show/NCT03499899 , and the generated answer is really not about this trial.

It’s frustrating. It looks so convincing and it lies without shame.

Here again, it relates to how LLMs work: they are trained to guess the next likely word but have no real memory of the individual references they were trained on.

So, it’s not able to look up a specific trial. What if I prompt it with the trial XML?

The prompt was too large to run, so that was not helpful.
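One common workaround when a document exceeds the context window, sketched here as a suggestion I did not test in this session, is to split the XML into overlapping chunks and send each chunk with the same extraction instructions. The sizes below are illustrative; real limits are measured in tokens, not characters.

```python
def chunk_text(text, max_chars=3000, overlap=200):
    """Split a long document into overlapping windows so each piece
    fits within a model's context limit. The overlap reduces the risk
    of cutting a relevant criterion in half at a chunk boundary."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```

The per-chunk answers then need to be merged afterwards, which brings its own reconciliation problems.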

Asking one bit at a time — extract biomarkers from the text listing the eligibility criteria

This looks better.

Making the prompt as precise as possible

I added to the prompt a copy of the eligibility criteria after the instructions:

Inclusion Criteria:

Had advanced (loco-regionally recurrent not amenable to curative therapy or metastatic) breast cancer
Had adequate bone marrow and organ function.
Had an Eastern Cooperative Oncology Group (ECOG) performance status of 0 or 1
Had measurable disease, i.e., at least one measurable lesion as per RECIST 1.1 criteria (Tumor lesions previously irradiated or subjected to other loco-regional therapy was to be considered measurable if disease progression at the treated site after completion of therapy is clearly documented)
Progressed after adjuvant or 1 prior systemic treatment in the metastatic setting. Patients with de novo metastatic disease were eligible if they received 1 prior line of therapy
Had received prior systemic treatment that included taxane-based chemotherapy for adjuvant or metastatic disease
Had a site of disease amenable to biopsy, and was willing to undergo a new tumor biopsy at screening and during therapy on this study, the latter if medically feasible. Patients with an available archival tumor tissue did not need to perform a tumor biopsy at screening if patient had not received anti-cancer therapy since the biopsy was taken.
Had histologically and/or cytologically confirmed diagnosis of advanced TNBC (based on most recently analyzed biopsy from locally recurrent or metastatic site, local lab) meeting the following criteria: HER2 negative in situ hybridization test or an IHC status of 0 or 1+, and ER and PR expression was <1 percent as determined by immunohistochemistry (IHC)

Exclusion Criteria:

Had received prior immune checkpoint inhibitors as anticancer treatment such as anti-LAG-3, anti-PD-1, anti-PD-L1, or anti-PD-L2 antibody (any line of therapy)
Received prior neoadjuvant or adjuvant therapy with a platinum agent or mitomycin and experienced recurrence within 12 months after the end of the platinum-based or mitomycin containing therapy or received Platinum or mitomycin for metastatic disease
Had major surgery within 14 days prior to starting study treatment or had not recovered to grade 1 or less from major side effects
Presence of CTCAE grade 2 toxicity or higher due to prior cancer therapy. Exception to this criterion; patients with any grade of alopecia were allowed to enter the study.
Had received radiotherapy ≤ 4 weeks prior to randomization (≤ 2 weeks for limited field radiation for palliation), and had not recovered to grade 1 or better from related side effects of such therapy (with the exception of alopecia)
Had a known hypersensitivity to other monoclonal antibodies, platinum-containing compounds, or to any of the excipients of LAG525, spartalizumab, or carboplatin
Had symptomatic central nervous system (CNS) metastases or CNS metastases that required local CNS-directed therapy (such as radiotherapy or surgery), or increasing doses of corticosteroids within the 2 weeks prior to first dose of study treatment. Patients with treated brain metastases would be neurologically stable and without CNS progression for at least 12 weeks prior to randomization and had discontinued corticosteroid treatment (with the exception of < 10 mg/day of prednisone or equivalent for an indication other than CNS metastases) for at least 4 weeks before first dose of any study treatment
Had clinically significant cardiac disease or impaired cardiac function

The Answer

It is pretty good for the molecular biomarker part!
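As a sanity check on the model’s output, even a crude keyword pass over the same criteria text recovers the obvious biomarker mentions. The vocabulary below is hand-picked for this example, not a real ontology.

```python
import re

# Hand-picked vocabulary for illustration; a production system would
# use a curated biomarker ontology instead.
BIOMARKERS = ["HER2", "ER", "PR", "PD-1", "PD-L1", "PD-L2", "LAG-3"]

def find_biomarkers(text):
    """Return the vocabulary biomarkers mentioned in free text."""
    found = []
    for marker in BIOMARKERS:
        # Boundaries stop 'ER' from matching inside 'HER2' or other words.
        pattern = r"(?<![A-Za-z0-9])" + re.escape(marker) + r"(?![A-Za-z0-9])"
        if re.search(pattern, text):
            found.append(marker)
    return found

criteria = ("HER2 negative in situ hybridization test or an IHC status of "
            "0 or 1+, and ER and PR expression was <1 percent")
print(find_biomarkers(criteria))  # prints: ['HER2', 'ER', 'PR']
```

A rule-based baseline like this is far less flexible than an LLM, but it never hallucinates, which makes it a useful yardstick.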

Now, let’s try to get the previous-treatments eligibility criteria only

Answer

The patient_progression_state seems correct. However, I will need to find a better way to explain how it should extract the previous treatment’s last dose date relative to the trial start.

Conclusion

After trying various prompts, I’m confident that we can make ChatGPT or the GPT API helpful in extracting biomarker and treatment information from text. We would need to train it, though, giving it 200 to 500 training prompts per use case just to get started.
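Preparing those training prompts mostly means writing prompt/completion pairs and serialising them as JSONL, one example per line. The records below are invented for illustration, and the exact field names the fine-tuning endpoint expects should be checked against the provider’s current documentation, since the format changes over time.

```python
import json

def to_jsonl(examples):
    """Serialise (prompt, completion) pairs as JSONL, one record per line."""
    return "\n".join(
        json.dumps({"prompt": prompt, "completion": completion})
        for prompt, completion in examples
    )

# Illustrative examples; a real dataset would need 200-500 of these per use case.
examples = [
    ("Extract biomarkers: HER2 negative, ER expression <1 percent",
     '{"biomarkers": ["HER2", "ER"]}'),
    ("Extract biomarkers: PD-L1 positive tumors only",
     '{"biomarkers": ["PD-L1"]}'),
]
print(to_jsonl(examples))
```

Keeping the completions in a strict JSON schema makes it easy to validate the model’s answers automatically later on.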

However, OpenAI, the creator of ChatGPT, is currently not compliant with the EU AI Act, and there are uncertainties surrounding the business model for developers investing time and money in teaching OpenAI models.

As a result, it may be safer to continue fine-tuning other clinical transformers that can run on our own servers without sharing data with third parties. This ensures compliance, data security, and IP protection while still leveraging the power of AI in healthcare.

I’m not giving up on OpenAI, though. It has made using its APIs remarkably easy and enjoyable: with well-documented resources, intuitive interfaces, and comprehensive developer guides, it empowers developers to seamlessly integrate AI into their applications, as long as your use case does not depend on knowing the objective truth.

Excited about the potential of using OpenAI for personalized medicine? Stay tuned for my upcoming post, where I’ll delve into generating on-demand scientific overviews using OpenAI.

If you have any questions or insights about OpenAI and personalized medicine, I’d love to hear from you in the comments section below. Let’s explore the future of healthcare AI together!
