DeepMind’s AlphaFold AI system solved biology’s biggest problem

Sanjeev Arora
Second-Level Thinking
4 min read · Aug 7, 2022


Opinions expressed are solely my own and do not express the views or opinions of my employer or any other institution.

What is AlphaFold?

AlphaFold is an AI system built by scientists and engineers at DeepMind to solve the 50-year-old protein folding problem. It can predict the 3D structure of a protein from its 1D amino acid sequence alone, and it has now predicted the three-dimensional shape of almost all known proteins.

What is a Protein?

Proteins are among the most important molecules in any living organism. They are like exquisitely designed machines, each performing a specific function in the body, such as transporting nutrients or attacking pathogens, and these functions are essential for life. Proteins are made up of amino acids arranged in a string. There are 20 amino acids to choose from, and every protein found in nature, from insulin and collagen to the antibodies that fight viruses, is a sequence of these 20 building blocks. To really understand how a protein works, it is essential to understand its 3D assembly: the positions of its amino acids and the bends and folds of the chain. This unique, complex structure is what allows each protein to perform its function.
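The idea that a protein is a string written in a 20-letter alphabet can be made concrete with a minimal Python sketch. The one-letter codes are the standard ones; the helper names and the example fragment are made up for illustration and are not part of AlphaFold.

```python
# The 20 standard amino acids, written with their conventional one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def is_valid_sequence(sequence: str) -> bool:
    """Return True if the string is non-empty and uses only the 20 standard letters."""
    return len(sequence) > 0 and all(residue in AMINO_ACIDS for residue in sequence)


def count_possible_sequences(length: int) -> int:
    """Number of distinct amino acid strings of the given length: 20 ** length."""
    return len(AMINO_ACIDS) ** length


# A made-up 10-residue fragment, used only for illustration.
fragment = "MKTAYIAKQR"
print(is_valid_sequence(fragment))              # True
print(count_possible_sequences(len(fragment)))  # 10240000000000, i.e. 20 ** 10
```

Even a chain of 10 amino acids has over ten trillion possible sequences, and the string by itself says nothing about the folded shape. That gap between the 1D sequence and the 3D structure is what AlphaFold predicts.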

As DeepMind explains, proteins are the building blocks of life: they underpin every biological process in every living thing. Because a protein’s shape is closely linked to its function, knowing a protein’s structure unlocks a greater understanding of what it does and how it works.

Video: Protein folding explained (DeepMind, 2020)

What is the recent major breakthrough on AlphaFold?

In July 2022, DeepMind, in partnership with EMBL’s European Bioinformatics Institute (EMBL-EBI), released predicted structures for nearly all cataloged proteins known to science. The release expands the AlphaFold DB by over 200x, from nearly 1 million structures to over 200 million, with the potential to dramatically increase our understanding of biology. The update includes predicted structures for plants, bacteria, animals, and other organisms, opening up new opportunities for researchers to use AlphaFold on important issues such as sustainability and drug discovery.

Figure: AlphaFold predictions referenced in publications. Source: DeepMind

Why do I think AlphaFold is an emerging technology in digital biology?

Normally, working out the structure of just one protein would take researchers 3–5 years. Some researchers might spend their entire Ph.D. figuring out how a single protein folds. The AlphaFold AI system has improved to the point where it can predict the structure of an average protein in a matter of minutes. It solved biology’s biggest mystery by predicting the structure of nearly every protein known to science. In addition, researchers looking to create new synthetic proteins can now start from an existing structural model to test their hypotheses and tweak that structure to make new discoveries.

DeepMind has also made this research and the AlphaFold AI system open source and free to the research community to unlock future innovations. AlphaFold DB will open up huge new opportunities in digital biology (AI and computational models) and will have an impact on important issues like sustainability, food insecurity, and neglected diseases. For instance, the Drugs for Neglected Diseases initiative (DNDi) is advancing drug discovery for neglected diseases such as Chagas disease and leishmaniasis, which affect millions of people in poor and vulnerable communities.

What is the level of accuracy of the AlphaFold AI system predictions?

According to Demis Hassabis, co-founder and CEO of DeepMind, the team reached atomic accuracy: an error of less than 0.1 nanometer, roughly the width of an atom. An atom is about a million times smaller than the thickest human hair; atomic diameters range from about 0.1 to 0.5 nanometers (1 × 10⁻¹⁰ m to 5 × 10⁻¹⁰ m). That is the level of accuracy biologists and other scientists need in order to rely on the results and build on computational predictions in areas like drug discovery and disease understanding.
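To check the scale comparison, here is a small back-of-the-envelope sketch in Python. The 0.1 nm figures come from the text above; the hair width of 100 micrometers is my own assumption for a thick human hair, and the function names are hypothetical.

```python
NANOMETER_IN_METERS = 1e-9


def nm_to_m(value_nm: float) -> float:
    """Convert a length in nanometers to meters."""
    return value_nm * NANOMETER_IN_METERS


def hair_to_atom_ratio(hair_width_m: float, atom_diameter_nm: float) -> float:
    """How many times wider a hair is than an atom of the given diameter."""
    return hair_width_m / nm_to_m(atom_diameter_nm)


# Figures quoted in the text: error under 0.1 nm, smallest atoms about 0.1 nm wide.
error_m = nm_to_m(0.1)

# Assumption (not from the article): a thick human hair is about 100 micrometers wide.
ratio = hair_to_atom_ratio(hair_width_m=100e-6, atom_diameter_nm=0.1)

print(f"error bound: {error_m:.1e} m")
print(f"hair / atom: {ratio:.0e}")
```

Under that assumption the error bound works out to about 1 × 10⁻¹⁰ m and the hair-to-atom ratio to about one million, which matches the comparison in the text.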

Where can the research community find the AlphaFold database?

DeepMind and EMBL-EBI will continue to refresh the database periodically. Access to structures will continue to be fully open, under a CC-BY 4.0 license, and bulk downloads will be made available via Google Cloud Public Datasets.
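For readers who want a single structure rather than a bulk download, the sketch below builds a download address for one prediction. Treat it as an assumption, not official documentation: it supposes that the database is served from alphafold.ebi.ac.uk and that files follow the naming pattern shown in the comment. The function name is hypothetical and the version number is a placeholder, so check the database’s own download page before relying on it.

```python
def alphafold_model_url(uniprot_accession: str, version: int) -> str:
    """Build a download URL for one predicted structure (assumed URL pattern)."""
    # Assumption: files are named AF-<accession>-F1-model_v<version>.pdb
    # under https://alphafold.ebi.ac.uk/files/. Verify against the
    # database's documentation; the pattern and version may change.
    return (
        "https://alphafold.ebi.ac.uk/files/"
        f"AF-{uniprot_accession}-F1-model_v{version}.pdb"
    )


# P01308 is the UniProt accession for human insulin; version 4 is a placeholder.
print(alphafold_model_url("P01308", version=4))
```

The sketch only builds the address and does not contact the server, so it runs offline. Fetching the file and parsing the coordinates is left to whichever structure-handling library you prefer.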

