# The Advantages of Modeling Clinical Data for Control Arms

In this series of posts, we are exploring the challenges of clinical trials and how historical data and computational models can be used to make those trials more efficient. Patient recruitment is the biggest bottleneck for many trials [1]. Using carefully curated historical data in place of real patients in the control arm of a clinical trial can reduce recruitment time and time to market [2]. This post describes how computational models can improve the utility of historical data in clinical trials.

Consider a simple example. Suppose we want to run a trial for a new treatment on patients between the ages of 65 and 75, and we have historical control data from an earlier trial on patients between the ages of 60 and 70. Although the control populations are similar, they are clearly not the same. Can we still use the control arm data from the first trial to inform the new trial?

Borrowing data from a historical trial is straightforward when both study populations are identical. Requiring the historical study population to exactly mirror our current trial’s design and population is impractical, but with the help of a model we can use similar historical data to predict the endpoint for the current control arm without sacrificing statistical rigor [3].

To use the historical control arm data in our example, we first need to understand what effect the age covariate might have on our trial endpoint. Once we can measure the effect, we can control for any differences between the historical and trial patient populations. For the sake of simplicity, let’s use a linear model to parameterize the effect of age on our trial endpoint:

endpoint = c + β ∙ age

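The parameter β tells us the degree to which a patient’s age influences the endpoint. After fitting this model to the historical data, we can predict the endpoint value and its uncertainty in the current control arm. For ages 65 to 70 the prediction stays within the historical age range; for ages 70 to 75 it is an extrapolation, and the uncertainty grows the further we move from the ages the historical data covers.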
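As a rough illustration of this age adjustment, the sketch below fits the linear model by ordinary least squares to a synthetic stand-in for the historical control arm, then predicts the mean endpoint and its standard error at ages in the new trial's range. The data, the true values of c and β, the noise level, and the helper names `fit_linear` and `predict_mean` are all assumptions made for illustration; none of them come from a real trial.

```python
import numpy as np


def fit_linear(x, y):
    """Ordinary least squares fit of y = c + beta * x.

    Returns the coefficients (c, beta), their covariance matrix,
    and the estimated residual variance.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # design matrix [1, x]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - 2)       # unbiased residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)       # covariance of (c, beta)
    return coef, cov, sigma2


def predict_mean(coef, cov, x_new):
    """Predicted mean endpoint at x_new and the standard error of that mean."""
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    X_new = np.column_stack([np.ones_like(x_new), x_new])
    mean = X_new @ coef
    # Row-wise quadratic form x^T cov x gives the variance of each predicted mean.
    se = np.sqrt(np.einsum("ij,jk,ik->i", X_new, cov, X_new))
    return mean, se


# Synthetic stand-in for the historical control arm (ages 60 to 70).
# The "true" c = 10, beta = 0.5 and noise SD = 2 are arbitrary assumptions.
rng = np.random.default_rng(0)
hist_age = rng.uniform(60, 70, size=200)
hist_endpoint = 10.0 + 0.5 * hist_age + rng.normal(0, 2.0, size=200)

coef, cov, sigma2 = fit_linear(hist_age, hist_endpoint)

# Predict the control-arm mean at ages spanning the new trial (65 to 75).
ages = np.array([65.0, 70.0, 75.0])
mean, se = predict_mean(coef, cov, ages)
for a, m, s in zip(ages, mean, se):
    print(f"age {a:.0f}: predicted mean {m:.2f} +/- {s:.2f}")
```

In this sketch the standard error is smallest near the centre of the historical age range and largest at age 75, which is the price of extrapolating beyond the ages the historical trial observed.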

Another common extrapolation problem involves translating between clinical trials with different durations. Suppose that the past trial in our example was run for 12 months, while the current trial will run for 18 months. How can we use the historical data now? The historical data covers the first 12 months of the new trial's duration, but we might reasonably expect the endpoint to continue evolving over the remaining 6 months of observations.

As in the previous example, we can use a statistical model to extrapolate to the longer study. Again, we use a linear model, this time in the elapsed trial time t:

endpoint = a + b ∙ t

The parameter b is the rate at which the endpoint value changes per unit of time. As before, by fitting this model to the historical data we can extrapolate the endpoint value and its uncertainty from 12 months to predict the values at 18 months in the current control arm.
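The duration example can be sketched in the same way. The code below fits the time model to synthetic visit data from a 12-month historical control arm and extrapolates the mean endpoint to 18 months. The visit schedule, the true values of a and b, the noise level, and the helper `predict` are hypothetical, and the fit treats every observation as independent, which ignores the correlation between repeated measurements on the same patient.

```python
import numpy as np

# Synthetic 12-month historical control arm: 100 patients, visits every 3 months.
# The "true" a = 50, b = -0.8 per month and noise SD = 3 are arbitrary assumptions.
rng = np.random.default_rng(1)
visits = np.array([0.0, 3.0, 6.0, 9.0, 12.0])
n_patients = 100
t = np.tile(visits, n_patients)
endpoint = 50.0 - 0.8 * t + rng.normal(0, 3.0, size=t.size)

# np.polyfit returns coefficients highest degree first, so (b, a) for a line,
# together with their covariance matrix in the same order when cov=True.
(b, a), cov = np.polyfit(t, endpoint, deg=1, cov=True)


def predict(t_new):
    """Predicted mean endpoint at time t_new and its standard error."""
    v = np.array([t_new, 1.0])              # matches the (b, a) ordering
    return a + b * t_new, float(np.sqrt(v @ cov @ v))


m12, se12 = predict(12.0)   # last time point covered by the historical data
m18, se18 = predict(18.0)   # extrapolated end of the current trial
print(f"month 12: {m12:.2f} +/- {se12:.2f}")
print(f"month 18: {m18:.2f} +/- {se18:.2f}")
```

The standard error at 18 months is larger than at 12 months, and it only reflects uncertainty in a and b. It says nothing about whether the endpoint really stays linear after month 12, which is an assumption of the model rather than something the historical data can confirm.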

For many diseases, a patient steadily progresses or has isolated flare states of increased severity. These patterns make it possible to predict a patient’s future state given their current and past disease trajectory. Furthermore, since historical data will generally encompass patients at differing stages of a disease, the historical data captures a wider range of disease progression than the trial duration would indicate. Therefore, historical data covering a shorter period than a current trial may still be useful for predicting the current trial’s endpoint.

When we use data from past clinical trials, we have the opportunity to benefit from the diversity of data drawn from many different studies. Having a mixture of patients at early, intermediate, and late stages of a disease allows us to build a complete model of disease trajectory. Similarly, we can apply a diverse set of historical data to many different scenarios — not just trials that match the design of a previous study.

Models are the key to unlocking the full potential of historical data to inform current trials with a broad spectrum of trial designs and patient covariates. Combining models with historical data even allows us to make predictions about a trial’s outcome given a particular trial design and particular patient population. But before we can rely on these predictions to make critical decisions in the drug development process, we must address the concern that using historical data will increase uncertainty and lead to a loss in power. In our next blog post, we’ll show how comprehensive models can be used to make robust, precise predictions about disease progression in clinical trial control arms.