The Future Of Biology Is Full-Stack

At the Interface of Hardware and Software

Justin Butler

From the first medical use of X-rays in 1895 to the sequencing of the human genome in 2003, capturing and analyzing biological information has promised to revolutionize our health. Countless lives have been saved and trillions of dollars of value have been created by collecting and analyzing this data. But there is still tremendous opportunity that technology has yet to unlock. We still lack the ability to diagnose and treat patients on an individualized basis. The holy grail of ‘personalized medicine’ appears to be perpetually just over the horizon.

The advent of artificial intelligence tools has ushered in a new era in healthcare. With these tools, a simple blood draw can detect cancer at curable stages, and a look at your retina can determine your risk of a heart attack. These advances are sure to make significant improvements in the lives of many patients. There are, however, many fundamental problems in biology that artificial intelligence alone cannot solve. These algorithms and toolkits are inherently limited by the quality and quantity of the data they receive.

The breakthroughs needed for the next generation of medicine require new, full-stack approaches integrating new hardware and software. In order to make better decisions about health, we need better signal about biology. Full-stack technologies will elucidate that signal.

Three areas that will significantly benefit from this approach are: 1) advanced manufacturing for personalized therapies, 2) higher-resolution biological tools, and 3) custom computer chips for biological workloads. These technologies will generate real advances in the understanding of the human body and the delivery of healthcare, allowing us to truly change the shape of medicine in the future.

Advanced Manufacturing for Personalized Therapies

Personalized therapies, such as cell therapies and neoantigen vaccines, are showing promise for several currently intractable diseases, including cancer, Type 1 diabetes, rheumatoid arthritis, and cardiac conditions. This promise has been validated by several recent acquisitions: Gilead purchased Kite for $12B and Celgene acquired Juno for $9B. While these types of treatments have been studied for decades, only recently has the medical community understood their mechanisms and side effects well enough to use them broadly in humans. The wide adoption of these therapies is limited by their manufacturing, which faces three main hurdles:

1. Scalability

These highly custom processes require specialized expertise and equipment. To bring a new therapy into the clinic, the current industry standard requires an unjustifiably large capital investment. The current model of pharmaceutical manufacturing does not lend itself well to making single doses of custom formulations: capacity is charged not per dose but on an asset-reservation basis, regardless of how much the asset is used. There is a significant opportunity to develop a manufacturing model that scales economically with a therapy as it goes from early clinical trials to clinical production.
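The effect of reservation-based pricing on a low-volume therapy can be shown with simple arithmetic. The sketch below is illustrative only: the reservation fee and dose counts are hypothetical figures, not taken from any real manufacturer, and the function name `cost_per_dose_reserved` is an assumption made for this example.

```python
def cost_per_dose_reserved(reservation_fee: float, doses_made: int) -> float:
    """Effective cost per dose when capacity is paid for by reservation.

    The fee is fixed regardless of usage, so the per-dose cost is simply
    the fee spread across however many doses were actually produced.
    """
    if doses_made <= 0:
        raise ValueError("doses_made must be positive")
    return reservation_fee / doses_made


# Hypothetical figures, chosen only to show the shape of the curve.
RESERVATION_FEE = 1_000_000.0

if __name__ == "__main__":
    for doses in (10, 100, 1000):
        per_dose = cost_per_dose_reserved(RESERVATION_FEE, doses)
        print(f"{doses:>5} doses -> {per_dose:,.0f} per dose")
```

Under these assumed numbers, an early trial making a handful of doses carries a per-dose cost orders of magnitude higher than the same asset at full utilization, which is the scaling problem described above.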

2. Supply Chain and Logistics

The pharma supply chain is inherently one-way: make a large batch of product and ship it out. The tools, techniques, and processes needed for a two-way supply chain (receipt of patient samples, modification, and return to patient) are not widely held. To enable this kind of workflow, both hardware and software innovations are needed, from ensuring chain of custody to safe and effective transportation of biological samples.

3. Asset and Labor Utilization

Historically, cost of goods sold (COGS) has been a minor concern in pharmaceutical manufacturing. With gross margins at nearly 80% across the industry, automation and other efficiency measures have not been a priority. Cell therapies have fundamentally lower margins when made with industry-standard methods. Personalized treatments will require modern manufacturing techniques that maximize labor and asset utilization, so that they are profitable at a price point that enables widespread availability.

High Resolution Biology

Firms making advances in AI software have set their sights on deciphering signal from the tremendous amount of noise produced by the current generation of biological tools. In parallel with these efforts, however, a significantly stronger signal is needed from the core toolset.

As an analogy: imagine a self-driving car that uses artificial intelligence to decipher data from visible-light cameras, lidars, and radars to identify a stop sign. The dynamics of a roadway make this a formidable challenge: light conditions, positioning, and occlusion of stop signs require advanced software to recognize them without fail. There are, however, some static conditions in this example; the stop sign is always red, always an octagon, and always says “STOP”. Imagine now identifying a stop sign that is constantly changing shape, color, and messaging. That second scenario more accurately describes a biological system. Cells, DNA, RNA, and proteins are constantly morphing, making them elusive to current tools. While tremendous advances have been made in genetic sequencing and in understanding the many different types of genetic code that can describe biological functions, we are still far from sensing and understanding the biological world in the way we can sense much of our physical world. Several factors limit our understanding of this biological world:

1. Isolation and characterization of biological components

To truly understand the impact of a given genetic code, we must first isolate the location and sequence of that code. Current tools might pick up genes across a population of cells (a tumor, for example), whereas understanding the code of a single cell or single protein is instrumental to determining the true genetic characteristics of those components. Only after the individual characteristics of a cell or protein are determined can a personalized treatment be properly designed, administered, and monitored for efficacy.
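A toy example may help show why a measurement pooled across a population differs from a per-cell one. Everything here is hypothetical: the population of 100 cells, the 5% carrier rate, and the helper names `bulk_fraction` and `carrier_indices` are assumptions for illustration, not a model of any real assay.

```python
from typing import List


def bulk_fraction(cells: List[int]) -> float:
    """Pooled view: the fraction of cells carrying a variant (1 = carrier)."""
    return sum(cells) / len(cells)


def carrier_indices(cells: List[int]) -> List[int]:
    """Per-cell view: which individual cells carry the variant."""
    return [i for i, has_variant in enumerate(cells) if has_variant == 1]


# Hypothetical population: 5 carrier cells among 100.
population = [1] * 5 + [0] * 95

if __name__ == "__main__":
    print("bulk fraction:", bulk_fraction(population))
    print("carrier cells:", carrier_indices(population))
```

The pooled value collapses the population into a single small number that is easy to lose in measurement noise, while the per-cell view identifies exactly which cells differ.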

2. Interaction of biological components

Understanding the order of the genetic code is important, but trying to describe an entire biological process based on the genetic code is analogous to describing the color of a house by looking at its foundation. We need tools that allow us to understand how the genetic code creates biological components and how those components interact chemically and biologically with their counterparts within cells and organs. There are many processes that occur before, during, and after gene expression which are not yet well understood. Developing tools and software to provide insights into these events will give a significantly broader understanding of what our genetic code means and how to use it to improve our health.

Custom Compute for Biological Workloads

As we develop tools for higher-resolution biology, the software workloads used to convert this data into useful signal will require specialized hardware. We have seen this trend recently with the development of many custom computer chips for artificial intelligence workloads. A similar approach is applicable to genomic workloads. In one example, researchers at Stanford University were able to speed up the alignment of long-read DNA sequences by up to 15,000x by utilizing custom compute hardware.
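To give a sense of why sequence alignment is a natural target for specialized hardware, here is a minimal Python sketch of the textbook Smith-Waterman local alignment score. This is not the method used by the Stanford researchers; it is a standard dynamic-programming algorithm shown only to illustrate the shape of the workload, and the scoring parameters (`match`, `mismatch`, `gap`) are assumed values.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Best local alignment score between sequences a and b.

    Fills a (len(a)+1) x (len(b)+1) table one row at a time, so the work
    grows with the product of the two sequence lengths.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("XXACGTXX", "ACGT"))
```

Every cell depends only on its three neighbors, so many cells can be computed in parallel. That regular, data-parallel structure is the kind of property custom hardware can exploit, and the cost of filling the table becomes significant as reads grow to thousands of bases.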

Currently, about 50% of the cost of a long-read DNA sequence is in the compute. In addition to significantly reducing the time required, the Stanford researchers referenced above were able to improve the contiguity (a proxy for quality) by 800x. These improvements in speed and accuracy significantly increase the value and applicability of these new types of genetic sequencing.

The creation of additional bespoke compute architectures applied to problems like protein interactions, imaging, and other novel types of biological systems will be necessary to bring the processing times and costs into a realm that enables their wide adoption and influence on the broader healthcare market.

Opportunities

We are at an unusual point in the development of new hardware and software tools for biology and healthcare. The unprecedented low cost of hardware development, combined with rapid developments in AI, is providing an entirely new attack vector for solving the biggest problems in healthcare. Neither of these disciplines alone will make the advances needed to truly deliver the promise of personalized medicine. Together, however, these technologies will allow us to realize the goal of treating patients rather than treating diseases. The teams that build systems able to collect novel biological signals and process them in useful, meaningful ways at an appropriate cost are going to change the world, while saving one life at a time.

Justin can be reached at justin@eclipse.vc