Historical Data for Control Arms of Clinical Trials


Those who cannot remember the past are condemned to repeat it.
 — George Santayana

In our last blog post, we talked about why clinical trials are such a bottleneck in the drug development process. Clinical trials take a long time, eating into the patent lifetime of a therapeutic. Why do they take so long? One big reason is that enrollment is a real challenge: finding prospective patients and enrolling them is laborious and time-consuming. For a major drug, the cost of running a phase III trial can exceed $250 million [1], but the cost of lost time on patent and forgone sales revenue can easily reach billions of dollars a year.

How can this situation be improved? The easiest way to shorten the enrollment period is to enroll fewer patients, but that alone would decrease the statistical power of the trial. If we instead reduce the number of patients in just the control arm and replace them with historical control data, we can maintain power while shortening enrollment time. This innovative trial design helps useful drugs reach the market faster and allows a greater percentage of trial participants to get what they want: potentially lifesaving experimental treatments. Next, we will examine why this innovation is a good idea in general, and how it can be safely put into practice.
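To make the power argument concrete, here is a minimal sketch using a textbook two-sided z-test for a difference in means. All numbers (effect size, arm sizes, the count of historical controls) are hypothetical, and the calculation assumes the historical controls are fully exchangeable with concurrent controls and pooled at full weight. That is a strong assumption, and the rest of this post is about when it fails.

```python
from math import sqrt
from statistics import NormalDist


def power_two_arm(delta, sigma, n_treat, n_control, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two arm means.

    delta: true difference in means; sigma: common outcome standard deviation.
    """
    std_normal = NormalDist()
    # Standard error of the difference in means with unequal arm sizes.
    se = sigma * sqrt(1.0 / n_treat + 1.0 / n_control)
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = delta / se
    # Probability of rejecting in either tail when the true effect is delta.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical scenario: effect of 0.3 standard deviations.
DELTA, SIGMA = 0.3, 1.0

# Conventional design: 175 treated + 175 concurrent controls (350 enrolled).
full = power_two_arm(DELTA, SIGMA, 175, 175)

# Shrinking only the control arm to 50 patients (225 enrolled) loses power...
small = power_two_arm(DELTA, SIGMA, 175, 50)

# ...unless 300 historical controls are pooled in (ASSUMED unbiased).
augmented = power_two_arm(DELTA, SIGMA, 175, 50 + 300)

print(f"full: {full:.2f}  small control arm: {small:.2f}  augmented: {augmented:.2f}")
```

Under these assumptions the conventional design has power of about 0.80, the trial with only 50 concurrent controls falls below 0.5, and adding the historical controls brings power back up to about 0.90 while enrolling 225 patients instead of 350.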

Control arms are a sticking point for patients involved in clinical trials. When a patient enrolls in a trial, they want to be part of the treatment arm so that they can benefit from a potentially transformative therapy. The risk of ending up in the control arm is a significant deterrent to participation; approximately 30% of patients who decline to participate in trials cite concern over receiving a placebo [2,3,4]. In disease areas without a lot of viable therapies, these control arms are effectively run over and over, in trial after trial. To test whether a particular drug should be put on the market, we frequently don’t need any more control arm data. Historical data is enough.

The fact that historical data can serve as a valid stand-in for a control arm in a trial is well known, and natural history models are regularly used in trials. In disease areas with low life expectancies, such as oncology, running control arms can be unethical if they deny patients potentially lifesaving experimental therapies during a clinical trial [5,6]. In rare diseases, patient enrollment may be so challenging that historical data is a necessity [7,8]. Finally, in disease areas such as neuroscience where few new therapies have been approved, historical data can be ample and well suited to current trials, saving time and resources.

Accurately deploying historical data in a clinical trial requires care. The primary concern is that error rates can be inflated by bias in the historical dataset relative to the current trial. That bias can come from many sources: differences in covariate distributions, differences in trial design, and the evolution of the standard of care over time.

A recent paper [9] presents a nice example of why historical data must be used properly. Data from two past metastatic colorectal cancer (mCRC) trials were compared to data from a more recent mCRC trial. When the historical controls are compared to the treatment arm of the newer trial, an apparently significant effect is observed. However, no effect is seen when the treatment and control arms of the newer trial are compared with each other. Taken at face value, the historical data would erroneously lead to the conclusion that the treatment is effective.

A closer comparison of the trials shows that the distribution of a covariate (a patient’s duration on the chemotherapeutic agent oxaliplatin) differs between the historical controls and the recent trial. Accounting for this covariate corrects the discrepancy, and the effects are consistent across trials. The overarching lesson is that historical data must be carefully matched to the trial data and design, which is a focus of statisticians developing innovative trial designs [10,11].
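The mechanism behind this example can be sketched with a small simulation. The data below is entirely synthetic and does not reproduce the paper's data or its analysis; the outcome model, the covariate shift, and the simple ANCOVA-style adjustment (a common-slope linear model) are assumptions chosen for illustration. The treatment has no true effect, but the historical controls differ from the new trial in the mean of one prognostic covariate.

```python
import random
from statistics import mean


def simulate_arm(n, mean_x, effect, rng):
    """Return (covariate, outcome) pairs with outcome = 2*x + effect + noise."""
    rows = []
    for _ in range(n):
        x = rng.gauss(mean_x, 1.0)
        y = 2.0 * x + effect + rng.gauss(0.0, 1.0)
        rows.append((x, y))
    return rows


def naive_effect(treated, control):
    """Unadjusted difference in mean outcome between the two groups."""
    return mean(y for _, y in treated) - mean(y for _, y in control)


def adjusted_effect(treated, control):
    """Difference in means after removing a common linear covariate effect."""
    def within(rows):
        xb = mean(x for x, _ in rows)
        yb = mean(y for _, y in rows)
        sxy = sum((x - xb) * (y - yb) for x, y in rows)
        sxx = sum((x - xb) ** 2 for x, _ in rows)
        return xb, yb, sxy, sxx

    xt, yt, sxy_t, sxx_t = within(treated)
    xc, yc, sxy_c, sxx_c = within(control)
    # Pooled within-group slope of outcome on covariate.
    slope = (sxy_t + sxy_c) / (sxx_t + sxx_c)
    return (yt - yc) - slope * (xt - xc)


rng = random.Random(0)  # fixed seed so the run is repeatable
# New trial: both arms share the same covariate distribution; true effect is 0.
treated = simulate_arm(2000, 1.0, 0.0, rng)
concurrent = simulate_arm(2000, 1.0, 0.0, rng)
# Historical controls: same outcome model, but the covariate mean is shifted.
historical = simulate_arm(2000, 0.0, 0.0, rng)

print("vs concurrent controls (naive):   ", round(naive_effect(treated, concurrent), 2))
print("vs historical controls (naive):   ", round(naive_effect(treated, historical), 2))
print("vs historical controls (adjusted):", round(adjusted_effect(treated, historical), 2))
```

With these settings, the naive comparison against historical controls should come out near 2 (the covariate slope times the shift in covariate means), which looks like a large treatment effect. The comparison against concurrent controls and the covariate-adjusted comparison should both sit near zero, matching the true effect.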

Innovative trial designs are not only better for patients and trial sponsors; they are also encouraged by regulators. Beyond extensive statements of interest in this area, the FDA has recently begun active support of natural history studies, made budget requests for funding to further invest in natural history models, and launched a pilot program in innovative trial design [12,13,14]. The increasing challenges of drug development, combined with the openness and evolution of regulators, suggest that natural history models will inevitably play a critical role in future clinical trials.

As we’ve seen, natural history models offer many advantages in situations where enrolling patients in control arms is inadvisable. But we must also recognize that historical data is not a panacea and has its own limitations. Using historical data requires understanding the differences between the populations of trials and their designs. Biases can develop as the standard of care or general health of the patient population evolves. Often, modeling or augmentation of the historical data is required to make that data comparable to a particular trial.

More extensive modeling is one way to improve the utility of natural history models. By using available data to try to predict more about patients, including secondary endpoints or additional covariates, we can build more powerful, realistic, and flexible models of clinical trials and better understand their limitations. In the next post in this series, we’ll discuss comprehensive modeling of patients in more detail.