Obvious Ventures has been fortunate to work alongside some of the leaders in computational biology for many years. Having been investors in both Zymergen and Recursion Pharmaceuticals since Series A, we have seen firsthand how powerful this next-generation approach to biology can be. These companies are leading the charge on a new method of investigating biology, one that is premised on machine learning driven by experimental data instead of root-cause-oriented scientific research.
Informed by the underlying philosophy of these leaders, we have spent over two years looking for an alternative approach to protein engineering that leverages computational techniques.
Introduction to Proteins
Genes are the functional units that dictate biological function, organism makeup, and even behavior. Chemically, genes are segments of DNA that encode instructions to make proteins, which in turn carry out most cellular functions. To make a protein, DNA is first transcribed and processed into messenger RNA (mRNA), and the mRNA is then translated into protein. This process within every cell is the metabolic engine of life, creating a vast set of proteins that are combined and enzymatically catalyzed into the chain reactions that direct cellular and, ultimately, organismal function.
If genes are the source code of life, proteins are the applications. Proteins play critical roles in cells: catalyzing reactions, sensing, signaling, and providing structure. Consequently, engineered proteins are of high therapeutic and industrial value. As the first functional output of gene expression, proteins are intimately involved with all aspects of both cellular and biological activity.
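The DNA-to-mRNA-to-protein flow described above can be illustrated with a short sketch. It is a deliberate simplification: it assumes the input is the coding strand of a gene, skips mRNA processing entirely, and uses the standard genetic code. The function names `transcribe` and `translate` and the example sequence are ours, chosen for illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
# "*" marks a stop codon.
RNA_BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(RNA_BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(coding_dna):
    """Turn a coding-strand DNA sequence into mRNA (T becomes U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """Read codons from the first AUG until a stop codon; return one-letter amino acids."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Illustrative 12-base "gene": start codon, two codons, stop codon.
example_protein = translate(transcribe("ATGGCCAAATAA"))
print(example_protein)
```

Real genes are far longer and, in human cells, the transcript is spliced before translation, so this should be read as a cartoon of the process rather than a model of it.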
A protein’s function is determined by its 3D structure, which is formed from a folded chain of amino acids. There are 20 distinct amino acids, and the order in which they are assembled is what gives rise to the universe of possible proteins. The search space for this problem is massive: proteins in the human body have a median length of ~400 amino acids, which means that there are 20⁴⁰⁰ possible sequences of that length alone.
Because of the vast combinatorial complexity in protein construction, the current approach to protein design is to start with known proteins in nature or a library of known physical properties mapped to sequence. There are various databases with thousands of entries of known protein sequences and measured protein characteristics. Protein engineers have used this “prior art” method as a starting point, applying mutagenesis from there. This approach is limited by the incomplete map that we have of protein function and the underlying uncertainty that we have in our rationalizations of biological mechanisms of action. It is further prevented from exploring the majority of the protein search space, since it must start from the small subset of sequences that are already known.
Stepping back, it’s worth noting that protein therapy is a massive therapeutic category within the pharmaceuticals industry. Seven of the top ten best-selling drugs today are protein therapies. Roughly 25% of the $950B in annual global pharmaceutical sales today is in the biologics category (protein therapy, gene therapy, cell therapy), and this figure is expected to increase even further. Even so, the sheer complexity of protein behavior has slowed the pace of research and advancement in this field.
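To put 20⁴⁰⁰ in perspective, the arithmetic can be checked directly. The sketch below uses only the two figures from the text (20 amino acids, a length of 400); the function and variable names are ours.

```python
import math

ALPHABET_SIZE = 20    # distinct amino acids
SEQUENCE_LENGTH = 400 # median human protein length quoted above

def sequence_space_size(length, alphabet_size=ALPHABET_SIZE):
    """Exact number of distinct sequences of the given length."""
    return alphabet_size ** length

def log10_sequence_space(length, alphabet_size=ALPHABET_SIZE):
    """Base-10 logarithm of the sequence count, useful for very long sequences."""
    return length * math.log10(alphabet_size)

total = sequence_space_size(SEQUENCE_LENGTH)
num_digits = len(str(total))

print(f"20^400 has {num_digits} decimal digits")
print(f"log10(20^400) = {log10_sequence_space(SEQUENCE_LENGTH):.2f}")
```

The count is a 521-digit number, roughly 10⁵²⁰, which is why exhaustively testing every sequence is not an option and any practical method has to search selectively.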
LabGenius: An Alternative Approach
LabGenius is taking a completely different approach to protein engineering based on large-scale lab experimentation and machine learning. The company runs a closed-loop synthetic biology flywheel that blends simulation and machine learning (in digital space) with high-throughput in vitro experimentation (in biological space).
Most importantly, LabGenius does not encode any pre-existing rules on protein function. The platform relies solely on experimental outcomes to identify protein variants of interest, and uses the underlying sequences of these “winners” to inform the next library that will be tested. This search process allows LabGenius to create proteins never before found in nature and traverse the amino acid space in a methodical and comprehensive way.
LabGenius was founded by James Field, who developed some of the underlying principles and platform during his PhD program at Imperial College London. When I first met James, I was struck by the novelty of his vision and how well he had thought through each aspect of the LabGenius platform. As we were able to spend more time together in the following months, I became convinced that he is growing into an incredible leader and operator as well. We are excited to team up with our friends at Lux Capital and co-lead LabGenius’ Series A.
By embracing a high-throughput experimentation and machine learning approach, LabGenius aims to take a drastically different path to evolve high-performing proteins. There are some nuances and challenges with this search process, but we believe that it represents the future of protein engineering.
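For readers who think in code, the general shape of such a test, select, and rebuild loop can be sketched in a few lines. To be clear, this is a toy illustration of the idea and not LabGenius's platform: the "assay" is a made-up scoring function standing in for a lab measurement, the next library is built by simple random mutation of the winners rather than by a machine learning model, and every name and parameter here is hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 amino acids

def toy_assay(sequence, hidden_target):
    """Stand-in for a lab measurement: counts positions that match a hidden target."""
    return sum(a == b for a, b in zip(sequence, hidden_target))

def mutate(sequence, rng):
    """Return a copy of the sequence with one randomly chosen position substituted."""
    pos = rng.randrange(len(sequence))
    return sequence[:pos] + rng.choice(AMINO_ACIDS) + sequence[pos + 1:]

def closed_loop_search(length=12, library_size=200, keep=10, rounds=30, seed=0):
    """Run the loop and return the best assay score observed in each round."""
    rng = random.Random(seed)
    hidden_target = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
    # Round 0: a random library, with no rules about function encoded up front.
    library = ["".join(rng.choice(AMINO_ACIDS) for _ in range(length))
               for _ in range(library_size)]
    best_per_round = []
    for _ in range(rounds):
        # "Experiment": score every variant, then keep the top performers.
        ranked = sorted(library, key=lambda s: toy_assay(s, hidden_target), reverse=True)
        winners = ranked[:keep]
        best_per_round.append(toy_assay(winners[0], hidden_target))
        # Next library: the winners themselves plus mutated copies of them.
        library = winners + [mutate(rng.choice(winners), rng)
                             for _ in range(library_size - keep)]
    return best_per_round

history = closed_loop_search()
print(history)
```

Because the winners are carried into each new library, the best score can never fall from one round to the next. In a real setting the measurements are noisy and expensive, which is where a learned model for choosing the next library earns its keep.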
We are excited to partner with LabGenius to build that future.