Reconnecting Synbio R&D to Manufacturing

Sestina Bio
6 min read · Feb 25, 2022

by Mona Mirsiaghi and Andrew Horwitz

Nearly 60% of all supply chains depend on natural product or petrochemical inputs. Sourcing these inputs comes at great environmental cost, and many powerful molecules with significant commercial potential are simply too rare to develop into viable commercial products. At Sestina Bio, we are harnessing the power of biology to build new supply chains for products that address the biggest challenges of our time. To realize the full impact of this opportunity, we must successfully navigate “the valley of death” of scaleup and succeed in manufacturing. Why have scaleup and manufacturing been so difficult in Synbio, and how can we chart a more cost-effective and reliable path to making real products?

What is the purpose of Synbio R&D?

This sometimes gets forgotten, but the purpose of Synbio R&D is to send the right strains to scaleup and manufacturing. Owing to costs, only a tiny fraction of the strains built and tested in R&D can ever be promoted to screening bioreactors, which provide the first real look at manufacturing potential. To promote the right strains, it is critical to predict manufacturing performance at the high throughput screening phase. This has become increasingly important as the gap grows between what we can build and what we can promote to bioreactors. Early pipelines relied on manual strain design, artisanal builds, flask-based growth models and simple chromatography assays. Today, room-scale automation of strain builds and microtiter plate-based assays with high throughput mass spec readouts are standard. These new R&D platforms are much more expensive and can build and test many more strains, but the resulting datasets still power models that are only ~15% accurate in predicting bioreactor performance.

This is bad in three ways. First, poor predictive models necessitate more rounds of fermentation screening to identify strains that meet the minimal requirements for scaling. We estimate that doubling the predictive power of current R&D models could reduce strain development costs 10-fold and time-to-scaleup 5-fold. Second, typical models have both weak and transient predictive power. In our experience, models must be continuously updated because their predictive power drops off beyond the set of strains they were trained on. Third, and most significantly, because the model chooses the only strains that will ever have a shot at scaleup, it determines the limits of our manufacturing success. Put differently, what is being left on the table?
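To make the stakes of that question concrete, here is a minimal simulation sketch of the promotion decision. It is not Sestina's model. It assumes that "~15% accurate" can be read as a correlation of about 0.15 between a screening score and bioreactor performance, that both are normally distributed, and that the library size (10,000 strains) and number of bioreactor slots (100) are purely illustrative. The function name `fraction_of_best_promoted` is hypothetical. The sketch asks how many of the truly best strains make the cut when strains are ranked by a noisy score.

```python
import math
import random


def fraction_of_best_promoted(n_strains, n_promoted, correlation, seed=0):
    """Fraction of the true top-`n_promoted` strains that are promoted when
    strains are ranked by a screening score with the given correlation to
    true bioreactor performance (illustrative assumption: bivariate normal).
    """
    rng = random.Random(seed)
    noise_weight = math.sqrt(1.0 - correlation ** 2)

    strains = []
    for _ in range(n_strains):
        true_perf = rng.gauss(0.0, 1.0)  # unobserved bioreactor performance
        # Screening score: partly signal, partly noise, so that
        # corr(score, true_perf) equals `correlation` in expectation.
        score = correlation * true_perf + noise_weight * rng.gauss(0.0, 1.0)
        strains.append((true_perf, score))

    indices = range(n_strains)
    by_truth = sorted(indices, key=lambda i: strains[i][0], reverse=True)
    by_score = sorted(indices, key=lambda i: strains[i][1], reverse=True)

    truly_best = set(by_truth[:n_promoted])
    promoted = set(by_score[:n_promoted])
    return len(truly_best & promoted) / n_promoted


# Hypothetical library of 10,000 strains competing for 100 bioreactor slots.
weak = fraction_of_best_promoted(10_000, 100, correlation=0.15)
strong = fraction_of_best_promoted(10_000, 100, correlation=0.90)
print(f"correlation 0.15: {weak:.0%} of the truly best strains promoted")
print(f"correlation 0.90: {strong:.0%} of the truly best strains promoted")
```

Under these assumptions, the weaker predictor recovers fewer of the truly best strains than the stronger one. Every strain it misses never reaches a bioreactor at all, which is one way to picture what is being left on the table.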

Intense efforts have been made over the years to improve these models, with limited success. We have lost faith that the problem can be solved with standard data sets focused on measurements of product (titer). Titer is important, since product is our end goal, but unfortunately there is a hard trade-off between titer assay cost and predictive power (Figure 1). On one end of the spectrum is the high throughput titer assay, with a correlation of only ~15% to bioreactors and a cost of ~$1 per data point. On the other end of the spectrum is a 200 kL manufacturing vessel, with perfect correlation but a cost of at least $1 million per data point (a rough estimate of the total cost to take a strain to a run of that scale). Of course, no one intentionally screens strains at 200 kL scale, but even 250 mL screening bioreactors cost >$1,000 per run, which is three orders of magnitude more expensive than a high throughput titer assay. As a result, companies are forced to make a drastic down-selection of candidates from high throughput screening to bioreactors using very poor information. We are shooting ourselves in the foot by letting weak models decide the fate of the entire Synbio effort.
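The cost gap can be made concrete with a little arithmetic. The sketch below uses the per-data-point figures quoted above. The two bioreactor figures are lower bounds, so the counts it prints are upper bounds. The $100,000 budget and the names in the code are hypothetical, chosen only for illustration.

```python
# Approximate cost per data point (USD), using the figures quoted in the text.
# The two bioreactor figures are lower bounds ("more than" / "at least").
COST_PER_DATA_POINT_USD = {
    "high throughput titer assay": 1,
    "250 mL screening bioreactor": 1_000,
    "200 kL manufacturing vessel": 1_000_000,
}


def max_data_points(budget_usd):
    """Upper bound on the whole data points a budget buys at each tier."""
    return {
        tier: budget_usd // cost
        for tier, cost in COST_PER_DATA_POINT_USD.items()
    }


# Hypothetical screening budget, for illustration only.
for tier, count in max_data_points(100_000).items():
    print(f"{tier}: at most {count:,} data points")
```

A $100,000 budget buys up to 100,000 plate-based measurements, at most 100 screening bioreactor runs, and not a single manufacturing-scale run.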

Figure 1: Trade-off between assay cost and predictive power. The predictive power of today’s high throughput titer assays is very low. Much better correlations can be achieved in bioreactors, but even the smallest vessels are too expensive to use for large-scale screening.

Pharma’s approach

Pharmaceutical companies face a similar challenge. Starting with libraries of up to 1 million compounds in early binding assay screens, pharma must select a small handful of compounds for promotion to pre-clinical in vivo models and beyond. Because of rapidly escalating costs at these later stages, pharma companies layer diverse data sets on top of their high throughput binding assays to “fast fail” compounds likely to be problematic from toxicity, bioavailability, or even synthetic chemistry tractability viewpoints. In recent years, innovators like Recursion and Tempus have adopted more sophisticated methods such as high content imaging and advanced computational techniques to further narrow and enrich the pool of hits that are assessed for promotion (Figure 2). The goal of these approaches is to generate high dimensional data sets that improve predictive power, and to do it cheaply, at the high throughput phase and scale, where this information is actionable. No one would dream of promoting a binding assay hit directly to pre-clinical or Phase I, yet that’s pretty much what Synbio does today.

Figure 2: Pharma uses high dimensional data to de-risk candidates at earlier stages of development. Left panel: After a high throughput binding assay, pharma fast-fails problematic candidates using a variety of secondary assays, including newer approaches like high content imaging (e.g., Recursion and Tempus). Right panel: Ethical concerns aside, consider the disastrous alternative, where binding assay hits are promoted directly to clinical trials, and failures are realized at the most expensive stages of development.

Sestina’s approach

At Sestina, we are developing novel data assets at high throughput scale to better navigate the transition from high throughput screening to first bioreactor run (Figure 3). These assets range from imaging approaches that allow incorporation of diverse cell state data into our prediction models to pooled screens in bioreactors that assess the fitness of thousands of strains at once under manufacturing conditions. Taken together, these data assets broaden the definition of “a good strain”, moving from a titer-centric view to a more holistic assessment that better approximates the complexity of biology and increases our ability to see into the scaleup and manufacturing future.

Figure 3: How to massively reduce cost and risk in scaling up Synbio products. Left panel: The current state in Synbio. Candidate strains are evaluated for titer and growth in 96-well plate high throughput assays, which are insufficient to accurately predict bioreactor performance. Poor promotion decisions mean longer timelines, higher cost to scaleup and much greater risk in manufacturing. Right panel: At Sestina, we are incorporating high dimensional data with the goal of reaching 90% accuracy in our predictions. The ability to select better strains at the high throughput screening phase and scale will be transformative for achieving manufacturing success.

We believe that the benefits of this approach extend well beyond selecting better strains. In the same way that rules of drug design learned by pharma constrain and focus the chemical matter used for subsequent screens, we are improving the composition of our strain libraries as we learn hidden rules of design. In other words, we aim not only to be better at finding the needle in the haystack, but to improve the needle-to-hay ratio over time. Of course, there are other pitfalls en route to successful commercialization of a Synbio product (to be discussed in a future blog), but in our view, recognizing that the “valley of death” actually begins in R&D is an important first step toward achieving the scope of impact that we need from Synbio.
