A Novel Quantum Algorithm for Protein-Folding: Paving the Way Toward Resolving One of the Biggest Mysteries in Biology With Quantum Computers

Published in Qiskit, Aug 19, 2021

Angiotensin II in complex with Angiotensin Converting Enzyme. Image: Ivana Vanjak via Wikimedia Commons

By Rafi Letzter, Technical Writer at IBM Quantum

How do proteins fold? Researchers using Qiskit have taken a significant early step toward helping to resolve this important mystery in biology. In the process, they showed the potential of quantum computation to tackle problems in the natural sciences.

Proteins have puzzled scientists since at least the late 1960s. These long chains consisting of components called amino acids are present in every living organism, acting as complex biological machines. Their functions range from catalyzing reactions within cells to providing the physical structure of feathers, hair, nails, and hooves, to serving as motors in muscle tissue. Proteins must fold into intricate shapes in order to perform these functions, locked into place by links between individual amino acids. A typical protein folds into its final form within micro- to milliseconds of forming — and no one knows how this happens.

But this new research by scientists on the IBM Quantum team may pave the way toward solving this mystery. And you can begin experimenting with the tools used in this research yourself using Qiskit.

Why protein folding is so hard

If you were handed an un-folded protein (scaled up to, say, a few feet long) and you wanted to fold it, you might begin by testing out different ways of folding it and linking up the amino acids along the chain. But it would quickly become clear this is an impossible task.

“If you want to check every possibility of how the proteins link up, this will be an exponential problem,” said Panagiotis Barkoutsos, a scientist at IBM Research Zurich who worked on the paper.

With each additional link on the chain, the problem takes more and more work to solve until it becomes astronomically difficult. The molecular biologist Cyrus Levinthal figured out in 1969 that if proteins were taking this plodding, check-each-possible-fold approach in our bodies, their folding process would take longer than the entire lifetime of the universe. This is called Levinthal’s Paradox.
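To get a feel for the numbers, here is a back-of-the-envelope version of Levinthal's argument in Python. It is only a sketch: the three conformations per amino acid and the sampling rate of 10^13 conformations per second are illustrative assumptions, not figures from the paper or from this article.

```python
# Rough Levinthal-style estimate. Every input here is an illustrative assumption.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate age of the universe


def exhaustive_search_years(n_residues, states_per_residue=3, samples_per_second=1e13):
    """Years needed to visit every conformation once, one after another."""
    conformations = states_per_residue ** n_residues  # grows exponentially with length
    return conformations / samples_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    for n in (10, 50, 100):
        years = exhaustive_search_years(n)
        ratio = years / AGE_OF_UNIVERSE_YEARS
        print(f"{n:>3} residues: {years:.2e} years ({ratio:.2e} x age of the universe)")
```

Each extra amino acid multiplies the search time by the assumed number of states per residue, which is why the estimate goes from a tiny fraction of a second for a short chain to far beyond the age of the universe for a chain of 100.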

“It would take forever,” Barkoutsos said, if our bodies approached the problem by classical sampling. “But our cells have a mechanism that allows this to be done in micro- to milliseconds. So there must be something that governs the dynamics of this process that we just don’t understand.”

Researchers have already discovered some pieces of the picture. As journalist Sarah Everts reported for Chemical & Engineering News in 2017, researchers have discovered chemical “chaperones” in cells that seem to aid in protein folding. There’s also evidence that these chaperones play a role in the actions of proteins after they have folded. But the complete picture is yet to emerge.

Figuring out how proteins fold and unfold so quickly will likely require a reliable, quick method of modeling protein folding on computers. So far, no one has managed to make this work.

The difficult classical approach to the problem

Classical computers are stuck with the check-every-possible-fold approach to protein folding. That means Levinthal’s Paradox applies: No supercomputer is powerful enough to make any real progress on its own. As an alternative to sampling, one could use classical molecular dynamics to drive the folding process. However, neither the quality of the available classical force fields nor the computing power needed to solve the corresponding equations of motion is sufficient to model the folding process.

To reach the necessary computational power to do classical protein folding research, Stanford University bioengineering researcher Vijay Pande developed a project called Folding@Home. Anyone on the internet can download software to their computer or mobile device that links it to the global Folding@Home network — effectively a giant, distributed supercomputer making use of unused processing power on people’s private machines. That network has grown to become one of the most powerful computing systems in the world, and has produced important results. But so far, it’s only able to model relatively short protein chains — chains much smaller than a typical protein found in a body.

Finally, AI-based approaches have also been used to predict folded protein structures. In this case, however, the algorithms (mainly based on neural networks) require extensive training on databases of known protein structures, which is not necessary in a brute-force sampling approach such as the one proposed in the IBM Quantum researchers’ study.

The quantum approach

IBM Quantum researchers Anton Robert, Panagiotis Barkoutsos, Stefan Woerner, and Ivano Tavernelli showed that quantum computers should be able to tackle the problem much more efficiently than classical computers.

In a paper published Feb. 17th in the journal npj Quantum Information, they demonstrated that generic quantum algorithms used for optimization could be repurposed for folding problems. And they successfully folded a model protein on an IBM Quantum 20-qubit processor using their new approach.

Rather than spend computational resources checking each possible fold of a protein, the quantum approach encodes the superposition of all physically meaningful ways of folding the protein into a model Hamiltonian. Then it samples these combinations statistically to find the series of folds that is the most stable.

“Everything in our body wants to be in the minimum free-energy configuration,” said Barkoutsos. “It is the most stable. And we generally say in nature — in our bodies — the most stable is the winning configuration.”

With this approach, the researchers simulated the folding of a 10-amino-acid chain called Angiotensin on a 22-qubit quantum simulator, and a seven-amino-acid neuropeptide on a 20-qubit quantum computer.

Multipurpose, open-source code

This work relied on a modified Variational Quantum Eigensolver (VQE) algorithm, a major optimization tool in the quantum arsenal, drawn from the Qiskit Application Module.

To suit their needs, the team modified the VQE to sample only the parts of the data relevant to the problem, producing a tool they called a Conditional Value-at-Risk (CVaR) VQE. In line with IBM Quantum’s open-source ethos, this tool is now available to the Qiskit community as of the latest Qiskit Nature release.
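The idea behind a CVaR objective can be sketched in a few lines of plain Python. Instead of averaging the energy over every measured sample, the objective keeps only the lowest-energy fraction `alpha` of the samples and averages those. The function name, the `alpha` values, and the sample energies below are made up for illustration; this is not the Qiskit implementation.

```python
import math


def cvar(energies, alpha):
    """Average of the lowest ceil(alpha * K) values among K sampled energies."""
    if not energies:
        raise ValueError("need at least one sampled energy")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(energies)                 # lowest (most stable) energies first
    keep = math.ceil(alpha * len(ordered))     # size of the low-energy tail
    return sum(ordered[:keep]) / keep


if __name__ == "__main__":
    # Hypothetical energies from eight measurement shots (arbitrary units).
    shots = [-3.0, 1.0, 0.5, -1.0, 2.0, 0.0, -2.0, 4.0]
    print("plain average (alpha = 1):", cvar(shots, 1.0))
    print("CVaR with alpha = 0.25:   ", cvar(shots, 0.25))
```

With `alpha = 1` the function reduces to the ordinary sample mean that a standard VQE would minimize. A smaller `alpha` makes the optimizer care only about the best samples, which suits a problem where a single low-energy fold is the answer.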

Other members of the community should be able to repurpose the code used in this research for a range of applications, Barkoutsos said. “The proposed algorithm is valid for any type of polymer, not just proteins, and therefore, in addition to biology, it is also applicable in domains like polymer chemistry and material design.”

Solving the paradox

Fully resolving Levinthal’s Paradox will require other researchers to take this algorithm and build upon it, then run their more advanced algorithms on more advanced quantum hardware expected in the coming years.

In the real world, proteins can be hundreds or thousands of amino acids long. And they fold in three dimensions. This paper abstracted the problem, placing the proteins on a three-dimensional tetrahedral lattice (that is, a triangular pyramid) with limited freedom of movement. Therefore, this work is an approximation, as three-dimensional movements in free space are much more difficult to model. The algorithm’s purpose is to set a foundation so people can start thinking and building on top of it.
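A toy version of the lattice picture can be written in plain Python. In this sketch each bond points in one of four tetrahedral directions, so a turn is an index from 0 to 3 (small enough to fit in two bits), and the sign of the step alternates along the chain as it does on a diamond-like lattice. The contact energy of -1 per neighboring pair is a hypothetical stand-in rather than the paper's energy terms, and the brute-force search is classical and only practical for very short chains.

```python
from itertools import product

# The four bond directions of a tetrahedral (diamond-like) lattice.
DIRECTIONS = ((1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1))


def decode(turns):
    """Map turn indices (0-3) to bead coordinates; return None if the chain overlaps itself."""
    position = (0, 0, 0)
    beads = [position]
    for step, turn in enumerate(turns):
        sign = 1 if step % 2 == 0 else -1      # the two sublattices alternate
        direction = DIRECTIONS[turn]
        position = tuple(p + sign * d for p, d in zip(position, direction))
        if position in beads:                  # backtracking or self-crossing
            return None
        beads.append(position)
    return beads


def contact_energy(beads):
    """Toy energy: -1 for each pair of non-bonded beads sitting on neighboring sites."""
    energy = 0
    for i in range(len(beads)):
        for j in range(i + 2, len(beads)):
            dist2 = sum((a - b) ** 2 for a, b in zip(beads[i], beads[j]))
            if dist2 == 3:                     # squared length of one lattice bond
                energy -= 1
    return energy


def best_fold(n_beads):
    """Brute-force the lowest toy energy over all valid turn sequences."""
    best = None
    for turns in product(range(4), repeat=n_beads - 1):
        beads = decode(turns)
        if beads is None:
            continue
        energy = contact_energy(beads)
        if best is None or energy < best[0]:
            best = (energy, turns)
    return best


if __name__ == "__main__":
    print(best_fold(6))
```

The search loops over 4^(n-1) turn sequences for a chain of n beads, so it hits the same exponential wall described earlier. That loop is the part the quantum approach replaces with sampling.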

There’s reason for optimism about the long-term prospects of this line of research, said Barkoutsos. “With a quantum computer, every time you add an extra qubit you’re doubling its computing power. With every new qubit, you will be able to simulate bigger and bigger systems.”

If quantum computers enable researchers to simulate the folding of proteins at the scale of real living cells, Barkoutsos said, then they can begin to chip away at Levinthal’s Paradox and work out how these molecular machines fold so quickly and perform their catalytic functions. Where that level of biomechanical understanding will take science, no one knows.
