Clinical Trial Matching at the Dana-Farber Cancer Institute

Ethan Siegel
Jan 10 · 4 min read

This article is the first in a series of posts describing how the MatchMiner project is used at the Dana-Farber Cancer Institute to enable more efficient matching of cancer patients to clinical trials.

Questions? Email us at

At the Dana-Farber Cancer Institute (DFCI), hundreds of clinical trials are being conducted at any given time. Many of the trials have complex eligibility criteria, and it’s an ongoing challenge to find the right patients to enroll in the right trials.

While there are many commercial options that attempt to address this problem, DFCI chose to develop an in-house solution in order to have greater control over the data, design and overall functionality of the final matching software.

The result of this effort is the MatchMiner platform — an open-source, automated clinical-trial matching tool, developed by the Knowledge Systems Group, which algorithmically matches patients to clinical trials.

Here are some results three years after implementing MatchMiner:

  • Integration with Epic Hyperspace: MatchMiner is embedded in the medical record, so with a single click oncologists can find trial matches for any patient who has received next-generation sequencing (NGS).
  • Daily trial matching of 26,000+ patients who have received next-generation sequencing through the Profile project at DFCI.
  • Daily trial matching against 270+ precision medicine targeted therapy clinical trials using specific clinical and genomic criteria.
  • 100+ trial enrollments to targeted therapy clinical trials seeking patients with specific genomic alterations.
  • Creation of Clinical Trial Markup Language (CTML), a formal open-source specification for describing clinical trial eligibility criteria in a machine-parsable manner suitable for automated trial matching.
  • 1,200+ molecular tumor board reports generated for the Gastrointestinal Cancer Treatment Center at DFCI.

Before diving into the details of the system, it’s worth understanding the problem which MatchMiner attempts to solve. This problem can be stated quite simply: there’s a huge amount of data to analyze to determine if a patient is eligible to enroll in a clinical trial, and not enough time to do it.

Each individual cancer contains unique genetic alterations, and tumor DNA sequencing tests are a common way to identify those alterations. Sometimes the alterations which are discovered through a DNA sequencing test can be used to help determine a treatment plan, or eligibility for a clinical trial. [1]


While genomic testing can be highly informative for determining therapeutic options for a patient, the reports resulting from these tests can be long, complicated and difficult to interpret, even for a highly trained oncologist. Compounding this, oncologists often have a limited amount of time to spend with each patient (about 20 minutes [2]), hampering their ability to find and interpret the relevant information in these long reports.

Clinical trials add another layer of complexity to interpreting genomic data. Trials often include genomic alterations as eligibility criteria for enrollment, but while clinical trial eligibility criteria are generally made publicly available, not all trials publicly provide the full set of genomic eligibility. Additional detailed eligibility may only be available within the institution running a particular trial, stored in a Clinical Trial Management System (CTMS). Whether it is published publicly or held in a local CTMS, genomic eligibility for a trial may not be easily searchable. In addition, eligibility guidelines for a trial can evolve as preliminary results are examined. For all of these reasons, keeping current on clinical trial options for patients can be a challenge for physicians and their teams.

One of the biggest challenges involved in the MatchMiner project was getting access to and building a system to ingest the relevant genomic, clinical and trial data, and then transform it into a format suitable for automated, algorithmic matching.
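To make "a format suitable for automated, algorithmic matching" a little more concrete, here is a minimal Python sketch of one way it could look: eligibility expressed as a nested and/or tree that is evaluated against structured patient records. Every name here (`Patient`, `matches`, the `gene`, `cancer_type` and `min_age` keys, and the sample data) is a hypothetical chosen for illustration. It is not MatchMiner's actual schema or matching engine.

```python
from dataclasses import dataclass


# All class, field and key names are hypothetical, for illustration only.
@dataclass
class Patient:
    patient_id: str
    cancer_type: str
    age: int
    altered_genes: frozenset = frozenset()


def matches(criteria: dict, patient: Patient) -> bool:
    """Recursively evaluate a nested and/or eligibility tree for one patient."""
    if "and" in criteria:
        return all(matches(child, patient) for child in criteria["and"])
    if "or" in criteria:
        return any(matches(child, patient) for child in criteria["or"])
    if "gene" in criteria:
        return criteria["gene"] in patient.altered_genes
    if "cancer_type" in criteria:
        return criteria["cancer_type"] == patient.cancer_type
    if "min_age" in criteria:
        return patient.age >= criteria["min_age"]
    raise ValueError(f"Unknown criterion: {criteria}")


# Made-up trial: adults with lung adenocarcinoma and an EGFR or ALK alteration.
trial_criteria = {
    "and": [
        {"min_age": 18},
        {"cancer_type": "Lung Adenocarcinoma"},
        {"or": [{"gene": "EGFR"}, {"gene": "ALK"}]},
    ]
}

# Made-up patients.
patients = [
    Patient("P1", "Lung Adenocarcinoma", 64, frozenset({"EGFR", "TP53"})),
    Patient("P2", "Lung Adenocarcinoma", 51, frozenset({"KRAS"})),
    Patient("P3", "Melanoma", 47, frozenset({"BRAF"})),
]

matched_ids = [p.patient_id for p in patients if matches(trial_criteria, p)]
print(matched_ids)
```

Once both sides are structured like this, re-running the match whenever a trial's criteria change or new sequencing results arrive is cheap, which is the kind of repeated checking that is hard to do by hand.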

In the next post, we will describe some of the ways the MatchMiner system structured trial data in order to make it suitable for trial matching.

Much of the content for this article is adapted from a preprint first published on bioRxiv.

Interested in setting up a local instance? Email us at
