Predicting clinical trial risk with AI: elevating efficiency in healthcare through natural language processing

Exploring the Application of AI and NLP in Improving Clinical Trial Risk Assessment and Reducing Failure Rates.

Thomas Wood
Fast Data Science
Dec 1, 2023

The Role of AI & NLP in Clinical Trial Risk Assessment

Clinical trials pave the way to scientific breakthroughs, but approximately 90% end in failure. A subset of these failed trials can be categorised as ‘uninformative’: the trial is designed, conducted or reported in a way that prevents it from delivering scientifically valid information. As a result, valuable resources are lost, and the ethics of involving human subjects in futile trials become a critical concern.

To address this problem, Fast Data Science developed the Clinical Trial Risk Tool, an AI-powered tool that leverages NLP to detect potential risks of uninformative results in clinical trials. If your organisation needs a tailored AI strategy to leverage machine learning for healthcare projects, feel free to reach out to us.

The Aspects of an Informative Clinical Trial

In order to be scientifically informative, a clinical trial must meet the following conditions:

  1. The hypothesis addresses an unresolved scientific question of significance
  2. The study design allows for the gathering of significant evidence pertaining to the question
  3. The study is feasible, i.e., the recruitment of necessary participants is attainable
  4. The study is conducted with scientific rigour
  5. The study’s results are reported accurately, completely, and promptly.

The Role of AI & NLP

Identifying potentially problematic components of a clinical trial protocol, such as an inadequate Statistical Analysis Plan (SAP), a plan to recruit too few participants, or uncertainty about the expected effect size, requires a significant investment of expertise and resources.

This is where AI, specifically natural language processing (NLP), can assist. NLP is capable of identifying key points in the document, drawing the attention of human experts to those sections, and quickly triaging and flagging potential risks.
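To make the triage idea concrete, here is a minimal sketch of the simplest possible approach: keyword and pattern matching over the text of each protocol page. It is an illustration only, not the Clinical Trial Risk Tool's actual code. The names `Flag`, `PATTERNS`, `flag_pages` and `missing_items`, and the regular expressions themselves, are assumptions made up for this example, and the input is assumed to be a list of page texts already extracted from the PDF.

```python
import re
from dataclasses import dataclass


@dataclass
class Flag:
    label: str    # which item was detected (hypothetical label names)
    page: int     # 1-based page number where it was found
    snippet: str  # surrounding text to show to the human reviewer


# Hypothetical patterns for this sketch; a real tool would need far more
# robust matching (or trained models) than these three expressions.
PATTERNS = {
    "sap": re.compile(r"statistical\s+analysis\s+plan", re.IGNORECASE),
    "sample_size": re.compile(r"sample\s+size\s+of\s+(\d[\d,]*)", re.IGNORECASE),
    "effect_size": re.compile(r"effect\s+size|hazard\s+ratio|odds\s+ratio", re.IGNORECASE),
}


def flag_pages(pages: list[str]) -> list[Flag]:
    """Return one Flag per pattern match, with a short context snippet."""
    flags = []
    for page_number, text in enumerate(pages, start=1):
        for label, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                start = max(match.start() - 30, 0)
                end = min(match.end() + 30, len(text))
                flags.append(Flag(label, page_number, text[start:end].strip()))
    return flags


def missing_items(flags: list[Flag]) -> list[str]:
    """List the items that were never detected anywhere in the protocol."""
    found = {flag.label for flag in flags}
    return sorted(set(PATTERNS) - found)


if __name__ == "__main__":
    pages = [
        "The Statistical Analysis Plan is in Appendix B.",
        "A sample size of 1,200 participants is planned.",
    ]
    flags = flag_pages(pages)
    for flag in flags:
        print(flag.label, "on page", flag.page)
    print("not found:", missing_items(flags))
```

The page number and snippet let a reviewer jump straight to the relevant passage, while the list of missing items is a prompt to check whether the protocol really omits that information.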

Clinical Trial Risk Tool Workflow

We built the Clinical Trial Risk Tool in collaboration with domain specialists. Factors that we considered included: pathology, presence of a SAP, whether the effect estimate has been stated, the number of subjects and arms, the countries of execution, and the use of simulations in determining the sample size.

The tool was designed to use an ensemble of rule-based, machine learning (random forest) and neural network models to compute a score between 0 and 100 based on these factors. Based on the score, it also categorises the trial as HIGH, MEDIUM, or LOW risk.
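As a rough illustration of the final scoring step, the sketch below turns a handful of extracted factors into a 0 to 100 score and a risk level. It covers only a simple rule-based weighted sum, not the random forest or neural network components, and every number in it is a placeholder: the weights, the 100-subjects-per-arm cap, the thresholds of 70 and 40, and the convention that a higher score means lower risk are all assumptions for this example rather than the tool's real parameters.

```python
from dataclasses import dataclass


@dataclass
class TrialFeatures:
    has_sap: bool                 # a Statistical Analysis Plan was found
    effect_estimate_stated: bool  # the expected effect size is stated
    num_subjects: int             # planned number of participants
    num_arms: int                 # number of trial arms
    used_simulation: bool         # sample size was justified by simulation


def risk_score(features: TrialFeatures) -> int:
    """Weighted sum of factors, 0 to 100. Weights are hypothetical."""
    score = 0.0
    score += 30 if features.has_sap else 0
    score += 25 if features.effect_estimate_stated else 0
    score += 15 if features.used_simulation else 0
    # Up to 30 points for subjects per arm, capped at an assumed 100 per arm.
    per_arm = features.num_subjects / max(features.num_arms, 1)
    score += 30 * min(per_arm / 100, 1.0)
    return round(score)


def risk_level(score: int) -> str:
    """Map the score to a level. Assumes a higher score means lower risk."""
    if score >= 70:
        return "LOW"
    if score >= 40:
        return "MEDIUM"
    return "HIGH"


if __name__ == "__main__":
    trial = TrialFeatures(
        has_sap=True,
        effect_estimate_stated=True,
        num_subjects=400,
        num_arms=2,
        used_simulation=False,
    )
    score = risk_score(trial)
    print(score, risk_level(score))
```

Keeping the score and the level as separate functions means the thresholds can be adjusted without touching the weights, which is useful when domain specialists want to recalibrate what counts as HIGH risk.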

The tool has been made accessible via a web interface, allowing users to upload a trial protocol in PDF format. It then presents the risk score and level to the user, simplifying the workflow of a clinical trial reviewer.

Open Source Contributions

To encourage wider use of the tool, Fast Data Science has open-sourced the project on GitHub under the MIT licence. This allows other researchers and developers to modify, extend and improve the tool to meet their specific needs.

Future Directions

The team is exploring ways to extend the functionality of the Clinical Trial Risk Tool. Possibilities include adding more pathologies and locations, estimating trial complexity, cost, or other parameters besides risk, and continuously refining the AI model based on user feedback and updates in clinical trial methodologies.

For more information on how AI and NLP can facilitate clinical trial risk assessment, visit https://fastdatascience.com/how-can-we-assess-the-risk-of-a-clinical-trial-using-ai.

Thomas Wood
Fast Data Science

Data science consultant at www.fastdatascience.com. I am interested in all things AI and natural language processing. www.freelancedatascientist.net