Protein Folding with Quantum Annealing for Drug Discovery
Imagine you have to understand just one element that can make you walk, talk, watch and do any other activity you could ever imagine. What would that single element be?
Proteins are what makes you alive. Almost any activity in your body is linked to it. Proteins could be a helpful resource for everyday living activities and as new medical therapeutics. Indeed, let’s imagine insulin, renin and many other proteins and peptides that influence our body.
New proteins can help leverage people’s health to create new therapeutics or solve different problems. Indeed, let’s consider Alzheimer’s disease, Parkinson’s disease, Huntington’s disease, and cystic fibrosis. They are all related to the accumulation of misfolded proteins. That problem happens when proteins don’t fold correctly to their lowest energy state.
Those proteins, as described before, can negatively impact the health of the cell regardless of the protein’s function.
This article will describe The Protein Folding Problem and How it can be solved using Quantum Annealing. Furthermore, we will go deeper on how we can potentially create new drugs using Quantum Annealing principles.
Quantum annealing (QA) is a global minimum of a problem in a given set of candidate solutions.
The main idea is to use quantum fluctuations to find a procedure that finds an absolute minimum size/length/cost/distance from within a vast possibly but finite set of possible solutions using quantum fluctuation-based computation instead of classical analysis.
Quantum annealing is generally used where we only have a discrete number of possibilities, such as problems with optimisation or sampling. There, we can find the local minima that will be the global minima considering the size of the selected problem.
Quantum Annealing has different possibilities, such as finding the ground state of a spin glass or the travelling salesman problem. That would solve different problems that we currently would not be able to solve!
In 1988, Apolloni, Cesa Bianchi and De Falco proposed using Quantum Annealing to find the global minimum. After some years, different scientists invested their time to formulate some applications.
When we consider Quantum Annealing, we have to take care of two main points:
- The Hamiltonian
- The Eigenspectrum
The Hamiltonian of a system is Kinetic energies + Potential energy of all the particles associated with the system. The Hamiltonian takes different forms, and we can simplify the system by considering various features. Those include single or several particles in the system, the interaction between particles, potential energy, time-varying potential, or time-independent.
For example, imagine you have 2 apples; One apple is on the table, and the other is under the table. The apple under the table will have a lower energy value of the Hamiltonian than the apple o the table.
That idea comes from Quantum mechanics, and it is in honour of William Rowan Hamilton.
A quantum state is a mathematical entity that provides a probability distribution for the outcomes of each possible measurement on a system. Imagine if you don’t want to pay taxes, you would only buy objects in airports.
In general, airports are pretty far from each other, but if you travel extremely fast, you can “jump” from one airport to another, having a different energy level. Our world works in that way. There are different “allowed” energy levels to be and could potentially find a connection to others. The main idea is to find the lowest energy level of the system.
The closest point between two energy levels is called the minimum gap. If the gap is small, the problem is hard. If the gap is large, the problem is easy. Doing the Annealing too fast or molecule fluctuations could potentially lead to a different level of energy that is different from what we would want.
How does Quantum Annealing work?
1) It starts from a quantum-mechanical superposition where all the potential answers have equal weights.
2) The amplitudes of all candidate states keep changing. That leads to quantum parallelism, which causes quantum tunnelling between states.
quantum parallelism means that quantum computers solve each point in parallel (meaning simultaneously) as every other point.
3) We find the global minimum (in case there aren’t fluctuations or other high changes in energy levels
Analogy & Advantage
Quantum annealing can indeed outperform thermal Annealing when problems are extremely complex. When we talk about Quantum Annealing, we tend to use Quantum Monte Carlo methods. Those are different methods used in Quantum Computing to describe an accurate approximation of the system.
Monte Carlo Methods
Quantum Monte Carlo encompasses a large family of computational methods whose common aim is to study complex quantum systems. The term Monte Carlo city comes from the city in Monaco, famous for gambling. Indeed, Monte Carlo is a simulation evolving randomly.
Imagine that you have two containers that can contain randomly dropped balls. After some time, when you compare their weight, you can understand the proportion in size without doing anything else. In other words, it follows the Law of large numbers. Imagine if you have to understand the quantity of lights that hit a small size area. We need to represent a group of samples employing randomness. We pick the infinite directions, and we see what happens. Quantum computers can achieve Quadratic Speedup of the model simulation compared to a classical computer.
Francois Rabelais used to say: “Science without conscience is but the ruin of the soul”. I would say, “Science without using it is just a loss of time”.
Indeed, Quantum Annealing will be extremely important in the future for different problems in Optimisation and Sampling. For example, if you want to understand the lowest state of energy of a protein, you need to use a Quantum Annealer to understand that. Currently, D-Wave is the best platform to study Quantum Annealing.
Protein Folding Problem
After seeing what Quantum Annealing is, let’s go back to the main problem, what’s the Protein Folding Problem?
We have to define two main problems, the first one is physical, and the second one is computational.
- The physical problem is a question about how proteins fold so quickly. Indeed, proteins could fold even in nanoseconds. When a protein folds, it starts from a different structure, but it always has one form.
We define “Levinthal’s Paradox” as a statement to answer How does the process of searching and sorting through the protein’s large conformational space of disordered states happen so rapidly? And how is the same native state reached from so many different starting conformations?
In any case, even if there is a stochastic method, it’s not random at all!
- The second folding problem is computational: predicting a protein’s native structure from its amino acid sequence. Most current protein structure prediction methods make some use of database-derived conformational preferences. It would be desirable to achieve high-resolution protein structure prediction in models that do not rely on the information contained in protein structure databases. That would be possible by using and understanding protein structures and driving forces on a deeper and more physical foundation.
Protein structure prediction can be used to determine the three-dimensional shape of a protein from its amino acid sequence. Indeed, the function of the protein depends on its structure.
Computing the full electronic wavefunction of an average drug molecule numerically is expected to take longer than the age of the universe on any current supercomputers
Conventional Approach to Protein Folding
Today, a popular newly-developed approach uses machine learning to predict protein structures and protein fold recognition from amino acid sequences. Indeed, AlphaFold is the most interesting artificial intelligence program which performs predictions of protein structure.
The core of this is a neural network fed with data on how amino-acid sequences can map out protein structures.
The AI system learns from the previous data and can predict a protein segment’s structure, modelling based on what came before and after it. These deep neural networks then indicate the distances between different pairs of amino acids and the angle between their chemical bonds.
The main problem with the current system is when the problem increases in size. Indeed, Computers cannot compute proteins more significant than 150 residues classically, and it is still impossible to fold all the protein combinations into a computer. Impossible means that it would need too much time before giving us a clear answer
Quantum Computing Approach
Quantum Computers is an entirely different approach that can reach new levels of precision and prediction.
It works with discrete optimisation problems divided into 3 main steps.
1. Define the number of Qubits needed for the operation by calculating the number of Ammino Acids present in the protein chain. Regardless of how information is encoded, the bit string must uniquely enumerate each element of the low-energy solution space.
2. Constrain Energy Landscape with Pseudo-Boolean Expression. Construct a pseudo-Boolean energy function 𝐸(𝒒) = 𝐸(𝑞1, 𝑞2,⋯, 𝑞𝑛) that takes 𝒒 (qubits) as input and correctly reproduces the relative energies in the low-energy subspace of the original problem so that the optimal solution to 𝐸(𝒒) encodes the solution to the problem.
3. Map the Boolean Representation. Suppose one wishes to implement the energy expression on a quantum device. In that case, it may be necessary to manipulate the energy expression to contain only local fields and two-body couplings. In general, the problem can be implemented as QUBO on currently existing architectures for adiabatic quantum computing.
“Turn Ancilla” Construction of 𝑬(𝒒)
Now that we have a mapping function, the idea is to penalize folds where two amino acids overlap.
The energy function, 𝐸overlap (𝒒), will give this penalty that returns a high value if and only if amino acids overlap.
Potentially, we can create this function even if we divide the system into 3 parts. For example, 𝐸back (𝒒) penalizes the particular case of overlaps because the chain went directly backward on itself.
After exploring that, the next step is to consider the interaction energy among the different amino acids. 𝐸pair (𝒒) will ultimately determine the structure of the lowest energy fold.
Insing-type Hamiltonian Econdings
In general, we tend to map Proteins into a bidimensional system because we would not have enough qubits to do in other ways.
Indeed, a three-dimensional space would require Nlog(N) qubits. One way to encode lattice proteins is to use what’s called turns. We can define the Hamiltonian with the turn ancilla scheme by the following components.
- The H-back component: If two consecutive edges within a protein fold go between the same pair of vertices, it penalizes the lattice protein fold. Going back on itself would make the fold invalid.
- The H-redun component: If two 3-bit strings are of the same value, they don’t encode any valid direction within the grid, and the action is invalid
- The H-olap component: It has the same function as the H-back part. It is introduced is to reduce the required ancillary qubits.
- The H-pair component: marks interaction between non-bonded acids adjacent on a lattice.
In this article, we have described the Protein Folding Problem, What Quantum Annealing is and two methods to understand the minimum value of energy level of proteins.
From this perspective, the work will unlock different opportunities like drugs for a more specific group of people.
Indeed, the higher the variety of drugs, the higher the precision for any patient.