Protein-Protein Docking

Machine learning and cloud computing are two of the hottest topics in computational science right now, but one of the most important topics in computational drug design is something you may not have even heard of! Protein-protein docking, often simply called docking, is a set of computational techniques used to predict how proteins will bind together. How two proteins will interact in solution is often a mystery. The work devoted to solving it spans many academic articles and years of research, so today we’ll focus on the basics.

“A computational way to determine the binding affinity of two proteins”

Docking in solution

In the real world, checking whether two proteins bind is straightforward: put them in solution together in an ELISA (enzyme-linked immunosorbent assay) and measure how well they bind. The result is about as definitive an indication of binding as you can get. Unfortunately, ELISA has some shortcomings. First, an ELISA doesn’t show you where the two proteins come together, only whether they bind. This is a problem for drug development, where it is especially important to know how two proteins dock. In some situations, one protein may block an active site of the other, preventing it from binding a drug or other molecule, which could doom a medicine or lead to other complications. The other key shortcoming is cost. Computational docking requires only access to a computer, whereas ELISA requires hands-on time in a potentially expensive lab facility, along with equipment and pricey reagents.

Docking in silico

To dock two proteins computationally, you first need a digital description of each one, namely its three-dimensional structure. Many methods exist to determine the three-dimensional structure of a protein, but the most commonly used is X-ray crystallography. One major repository of this crystal structure data is the Protein Data Bank (PDB). The PDB makes these structures publicly available so that anyone can download them for computational experiments!
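
As a rough illustration of what "digitized" means here, the sketch below reads atom coordinates out of PDB-format text. It relies on the fixed-column layout of ATOM records in the PDB file format; the names `Atom`, `parse_atoms`, and `split_by_chain` are made up for this example and do not come from any particular library.

```python
from typing import Dict, List, NamedTuple


class Atom(NamedTuple):
    name: str      # atom name, e.g. "CA"
    residue: str   # three-letter residue name, e.g. "MET"
    chain: str     # chain identifier, e.g. "A"
    res_seq: int   # residue sequence number
    x: float       # coordinates in angstroms
    y: float
    z: float


def parse_atoms(pdb_text: str) -> List[Atom]:
    """Collect ATOM records from PDB-format text using fixed column positions."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM  "):
            continue  # skip HETATM, REMARK, and every other record type
        atoms.append(Atom(
            name=line[12:16].strip(),     # columns 13-16
            residue=line[17:20].strip(),  # columns 18-20
            chain=line[21],               # column 22
            res_seq=int(line[22:26]),     # columns 23-26
            x=float(line[30:38]),         # columns 31-38
            y=float(line[38:46]),         # columns 39-46
            z=float(line[46:54]),         # columns 47-54
        ))
    return atoms


def split_by_chain(atoms: List[Atom]) -> Dict[str, List[Atom]]:
    """Group atoms by chain so each docking partner can be handled separately."""
    chains: Dict[str, List[Atom]] = {}
    for atom in atoms:
        chains.setdefault(atom.chain, []).append(atom)
    return chains
```

Assuming a structure file has already been downloaded and saved locally (the filename is up to you), `split_by_chain(parse_atoms(open(path).read()))` returns one list of atoms per chain, which is a convenient starting point for treating one chain as the stationary partner and another as the mobile one.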

The docking of two proteins using PDB file 2XWT

In silico docking techniques

Making two physical objects fit together is pretty intuitive, so it might seem like docking proteins would be the same. It should be as easy as putting two puzzle pieces together, right? Put the two proteins together, see where they fit, and you’re done! In practice, docking is much more difficult than that. Unlike puzzle pieces, proteins aren’t governed by shape alone; their interactions also depend on hydrophobic effects, electrostatics, hydrogen bonding, and myriad other effects. Proteins aren’t rigid, either. Side chains can move to adopt new conformations, slightly changing the protein’s shape and surface. All of these factors make docking extremely computationally expensive. Even in a very simple case, there may be upwards of three trillion orientations of your proteins to test, and each “test” might consist of hundreds or thousands of computations. On a typical desktop computer, this might take well over a hundred hours of continuous work!
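
To get a feel for where a number like that comes from, here is a back-of-the-envelope estimate. The orientation count and the computations per test are taken from the paragraph above; the throughput of five billion operations per second is purely an assumption for illustration, not a benchmark of any real machine.

```python
def exhaustive_search_hours(orientations: float,
                            ops_per_test: float,
                            ops_per_second: float) -> float:
    """Estimate wall-clock hours for testing every orientation one after another."""
    total_ops = orientations * ops_per_test
    return total_ops / ops_per_second / 3600.0


# 3 trillion orientations, 1,000 computations each,
# on a hypothetical machine sustaining 5e9 operations per second (assumed).
hours = exhaustive_search_hours(3e12, 1_000, 5e9)
print(f"about {hours:.0f} hours")
```

Changing any of the three inputs shifts the answer proportionally, which is why cutting down the number of orientations to test pays off so directly.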

A plot of energy (score) versus distance from the reference state (epitope RMSD) for multiple docked structures. Notice the funnel as the docked structures converge on a low-energy conformation.

Such an exhaustive approach is rarely practical, so various methods have emerged to simplify the search. One technique used to reduce the number of computations is called Decoys As a Reference State, or DARS. This technique holds one protein stationary and lets the other move. It then generates thousands of random orientations of the mobile protein. The energy of each orientation is calculated, and the docked conformations with the lowest calculated energy and optimal binding distance are chosen. Each conformation’s energy can then be plotted against its distance from the aforementioned reference state. This plot may reveal a funnel shape. A good funnel plot (Figure 2) shows the docking runs converging on a single solution, and this convergence is evidence that the best orientation has been found! DARS is not as comprehensive as the exhaustive approach, but it can still find the optimal binding orientation with fewer than one millionth as many moves! Overall, this statistical approach significantly reduces the computational burden of docking, which makes it very important for computational drug development.
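
The sample, score, and rank loop described above can be sketched in a few lines. This is a toy illustration only, not the DARS potential or any production docking code: the "proteins" are three points each, `toy_energy` is a made-up contact score, and the starting ligand position is simply assumed to be the reference state.

```python
import math
import random


def random_rotation(rng):
    """Return a 3x3 rotation matrix built from a random unit quaternion."""
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        norm = math.sqrt(sum(c * c for c in q))
        if norm > 1e-12:
            break
    w, x, y, z = (c / norm for c in q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def transform(points, rot, shift):
    """Rotate each point about the origin, then translate it."""
    return [
        tuple(sum(rot[i][j] * p[j] for j in range(3)) + shift[i] for i in range(3))
        for p in points
    ]


def rmsd(a, b):
    """Root-mean-square deviation between two equally ordered point lists."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def toy_energy(receptor, ligand, contact=4.0, clash=2.0):
    """Made-up score: reward pairs within contact range, penalize clashes."""
    energy = 0.0
    for r in receptor:
        for l in ligand:
            d = math.dist(r, l)
            if d < clash:
                energy += 10.0
            elif d < contact:
                energy -= 1.0
    return energy


def sample_poses(receptor, ligand, n_poses, max_shift=5.0, seed=0):
    """Score random rigid-body poses of the ligand; return them lowest energy first."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_poses):
        rot = random_rotation(rng)
        shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
        pose = transform(ligand, rot, shift)
        # The unmoved ligand stands in for the reference state (an assumption).
        results.append((toy_energy(receptor, pose), rmsd(pose, ligand), pose))
    return sorted(results, key=lambda item: item[0])


receptor = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0)]  # stationary partner
ligand = [(0.0, 0.0, 3.0), (3.0, 0.0, 3.0), (0.0, 3.0, 3.0)]    # mobile partner
ranked = sample_poses(receptor, ligand, n_poses=1000)
best_energy, best_rmsd, _ = ranked[0]
```

Plotting the energy of every entry in `ranked` against its RMSD gives the kind of scatter plot in which a funnel would appear if low-energy poses cluster near the reference state. Whether one appears depends entirely on the scoring function, and this toy score makes no promise of that.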

There are also other techniques, such as Asymmetric Decoys As a Reference State (ADARS), and programs such as PIPER, which uses fast Fourier transforms (FFTs) to perform billions of energy evaluations efficiently. Because protein-protein docking is so important to drug design and development, this field will continue to grow as computational power increases and in silico methods become more prevalent. These methods are helping to push biotechnology to the cutting edge and improve the world around us.
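
Why do Fourier transforms help? The core trick is that a score of the form "multiply two grids together and sum" can be computed for every possible shift at once. The sketch below shows this on a toy one-dimensional grid with cyclic shifts. It is not PIPER's code; real FFT docking programs work on three-dimensional grids with richer energy terms, and the function names and example grids here are invented for illustration.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def scores_direct(receptor, ligand):
    """Score every cyclic shift one at a time: O(n^2) multiplications."""
    n = len(receptor)
    return [
        sum(receptor[(i + s) % n] * ligand[i] for i in range(n))
        for s in range(n)
    ]


def scores_fft(receptor, ligand):
    """Same scores via the correlation theorem: O(n log n) work."""
    n = len(receptor)
    r_hat = fft(receptor)
    l_hat = fft(ligand)
    product = [a * b.conjugate() for a, b in zip(r_hat, l_hat)]
    # Divide by n to normalize the inverse transform; inputs are real,
    # so any imaginary part left over is rounding noise.
    return [c.real / n for c in fft(product, inverse=True)]


receptor = [0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # toy "binding pocket" at cells 1-2
ligand = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]    # toy two-cell ligand
scores = scores_fft(receptor, ligand)
best_shift = max(range(len(scores)), key=lambda s: scores[s])
```

Both functions return the same list of scores, one per shift. The FFT version gets there with far fewer multiplications as the grid grows, which is the property that makes evaluating billions of placements feasible.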
