SkeleDock: Exploiting structural knowledge [TUTORIAL]

In this tutorial we will explain how to use SkeleDock, a scaffold docking tool freely available in PlayMolecule. We will show how to use it to dock a congeneric series, circumvent scaffold hops, and model complex macrocycles. Let’s dive into it!

Alejandro Varela
PlayMolecule
May 29, 2020


Predicting the binding mode of a ligand in a protein pocket is one of the holy grails of computational biology, and the community has put great effort into developing novel approaches to tackle this task. The most popular of these methodologies is docking. Without going into the details, docking tries to predict the binding mode of the ligand by using a so-called scoring function, which evaluates the interactions between the two molecules according to a set of geometry-based rules.

Hence, these algorithms only need the protein structure and an initial ligand conformation to work. However, for some proteins, more information than just the (apo) structure is available. In fact, for proteins which are or have been of interest to the pharmaceutical industry, several protein-small molecule complexes can often be found. If holo structures exist, a sensible way to proceed is to use that knowledge, for example by building a pharmacophore and filtering out the poses which do not fit it. Yet, if the co-crystallized molecule shares a common motif with the molecule to dock, we can do something smarter.

Similar molecules bind in a similar way
Given that similar molecules bind in a similar way, we can identify the common atoms, align the query molecule to the template based on these shared atoms, freeze them, and place all the others in an appropriate location using a classical scoring function. This is, roughly, what SkeleDock does. We invite you to follow along with the steps in PlayMolecule. Let’s try it!
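To make the "align on the shared atoms" idea concrete, here is a minimal sketch in plain Python. It is not SkeleDock's code: the function names, the `mapping` dictionary (query atom index to template atom index) and the choice of Horn's quaternion method for the rigid fit are all assumptions made for illustration. Finding the shared atoms is assumed to have happened already.

```python
import math

def centroid(points):
    """Geometric centre of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def shift_by(points, offset, sign=1.0):
    """Translate every point by sign * offset."""
    return [tuple(p[k] + sign * offset[k] for k in range(3)) for p in points]

def best_rotation(moving, target, iterations=1000):
    """Rotation matrix that best superimposes centred `moving` onto centred
    `target` (Horn's quaternion method; the dominant eigenvector of the 4x4
    matrix is found by shifted power iteration)."""
    s = [[sum(m[a] * t[b] for m, t in zip(moving, target)) for b in range(3)]
         for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    n = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shifting by the Frobenius norm makes every eigenvalue positive, so the
    # power iteration converges to the eigenvector of the largest one.
    shift = math.sqrt(sum(v * v for row in n for v in row))
    if shift == 0.0:  # degenerate input: nothing to rotate
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    q = [1.0, 0.3, 0.2, 0.1]  # arbitrary, generic starting vector
    for _ in range(iterations):
        q = [sum(n[i][j] * q[j] for j in range(4)) + shift * q[i]
             for i in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    w, x, y, z = q
    return [
        [w*w + x*x - y*y - z*z, 2*(x*y - w*z), 2*(x*z + w*y)],
        [2*(y*x + w*z), w*w - x*x + y*y - z*z, 2*(y*z - w*x)],
        [2*(z*x - w*y), 2*(z*y + w*x), w*w - x*x - y*y + z*z],
    ]

def align_on_shared_atoms(query_xyz, template_xyz, mapping):
    """Rigidly move the whole query so its shared atoms sit on the template.

    `mapping` (hypothetical format): {query_atom_index: template_atom_index}.
    """
    pairs = list(mapping.items())
    q_shared = [query_xyz[i] for i, _ in pairs]
    t_shared = [template_xyz[j] for _, j in pairs]
    qc, tc = centroid(q_shared), centroid(t_shared)
    rot = best_rotation(shift_by(q_shared, qc, -1.0),
                        shift_by(t_shared, tc, -1.0))
    aligned = []
    for p in shift_by(query_xyz, qc, -1.0):
        rotated = tuple(sum(rot[a][b] * p[b] for b in range(3))
                        for a in range(3))
        aligned.append(tuple(rotated[k] + tc[k] for k in range(3)))
    return aligned
```

After a call such as `align_on_shared_atoms(query_xyz, template_xyz, {0: 0, 1: 1, 2: 2, 3: 3})`, the mapped atoms overlap their template counterparts as closely as a rigid motion allows. The unmapped atoms simply ride along, and it is their final placement that is left to the scoring function.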

Building our way up from a fragment. SARS-CoV-2 main protease.
Diamond’s team was able to crystallize several fragments in the catalytic site of the main protease of SARS-CoV-2, and Postera.ai created an online platform that allows researchers to submit compounds inspired by those fragments. We can use SkeleDock to model the binding mode of these compounds using their corresponding fragments. Let’s see an example.

This is what one of the co-crystallized fragments looks like:

And this is the 2D depiction of the molecule we are trying to dock (compound MAK-UNK-105–15):

Finally, this is the main menu of SkeleDock’s app:

We can start by uploading the appropriate files (all the needed files can be downloaded from here): the .pdb file of the template (the co-crystallized fragment), the .csv file with the SMILES of the query molecule, and the .pdb file of the protein (without the fragment). Then, we just need to select the options that best suit our scenario. Let me explain them real quick:

  • Optimize with rDock: Atoms which do not have a template equivalent must be placed somewhere. To do this, we use rDock’s tethered protocol, which allows us to freeze the position of some atoms and explore the positions of the others. In some cases you might not need to run rDock, for example, if your query ligand is almost identical to your template.
  • Probe radius: rDock can use the template molecule to build the docking cavity, that is, the 3D space where molecules will be docked. We can place a sphere of radius X on each atom in the template, and use the volume that these spheres occupy as the cavity. This option sets the value of X. If your fragment is much smaller than the query molecule, you should probably use a large value like 12. If they are about the same size, you could even use 4 or 5.
  • Tethering force: Aligning both molecules based on their common atoms is usually not enough to achieve good poses. We can further increase the overlap with the template using tethering.
    One can picture tethering as linking the atoms in the query molecule to their template counterparts with a spring. The tethering force is then the constant of this spring. A higher force brings the atoms closer, achieving better overlap, but be careful: high forces can create artifacts and break chirality.
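The probe radius and tethering force options can be pictured with a few lines of plain Python. This is only a sketch of the two ideas as described above (a union of spheres around the template atoms, and a spring between each mapped atom pair). The function names, the `mapping` format and the harmonic form with its 0.5 factor are assumptions; rDock's actual cavity mapping and restraint terms may well differ.

```python
import math

def in_cavity(point, template_xyz, probe_radius):
    """True if `point` falls inside at least one sphere of radius
    `probe_radius` centred on a template atom."""
    return any(math.dist(point, atom) <= probe_radius for atom in template_xyz)

def tether_penalty(query_xyz, template_xyz, mapping, force):
    """Spring-like penalty for mapped atoms drifting from the template.

    `mapping` (hypothetical format): {query_atom_index: template_atom_index}.
    Assumed form: 0.5 * force * sum of squared distances.
    """
    return 0.5 * force * sum(
        math.dist(query_xyz[qi], template_xyz[ti]) ** 2
        for qi, ti in mapping.items()
    )

# Toy coordinates, purely illustrative.
template_xyz = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
query_xyz = [(0.0, 0.0, 1.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shared = {0: 0, 1: 1}  # the third query atom has no template equivalent

far_point = (10.0, 0.0, 0.0)
small_cavity = in_cavity(far_point, template_xyz, probe_radius=4.0)
large_cavity = in_cavity(far_point, template_xyz, probe_radius=10.0)
penalty = tether_penalty(query_xyz, template_xyz, shared, force=50.0)
```

In this picture, a larger probe radius lets atoms without a template equivalent reach further from the template, while a larger force makes any deviation of the shared atoms more costly, which is why very high values can distort the molecule.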

For this example, let’s increase the probe radius to 10 and the tethering force to 50. Your pose should look similar to this:

Docking a congeneric series: Cathepsin S
Fragments are easy… What about something larger and more complex? Let’s try to dock a congeneric series to a single template. For this example we will use Cathepsin S, a target with an abundance of holo structures which was part of the D3R Grand Challenge 4. We can use the same options as before, and maybe even increase the probe radius a little, to 12. Bear in mind that increasing the probe radius, and hence the docking cavity, makes SkeleDock slower.

This will be our template.

After docking the congeneric series contained in the .csv file, we can look at some of the results. The template has a triple bond, which prevents some of the query molecules from overlapping the template in that region.

Other queries that also have a triple bond can achieve greater overlap.

Facing scaffold hops. Circumventing local mismatches
At this point, you might be wondering about that mysterious “scaffold-hopping tolerant mode” option. It’s common in medicinal chemistry to introduce substitutions in a molecule to increase its potency, solubility, or other ADMET properties. These substitutions can break or shrink the common moiety shared by both molecules, leading to a poor alignment. How can we handle this?

SkeleDock’s autocompletion step comes in handy in these situations. You can read the publication for the details, but essentially, after the first round of mapping and alignment, we check where the mapping has stopped and try to extend it, even if that means accepting a few mismatches. This allows us to map rings to non-rings, atoms of different elements, etc. For example, in the following picture, after the first round of mapping, only seven atoms have been correctly mapped to the template, leaving two entire branches free (ring structures must be broken to allow this step). The positions of all those free atoms will be decided by rDock’s algorithm, unless the scaffold-hopping tolerant mode option is selected. In that case the mismatches encountered are ignored, leading to a better pose, shown on the right side. Only a couple of atoms are now free, and we were able to model the macrocycle without any additional work.
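The idea of extending a mapping past a mismatch can be illustrated with a toy graph walk. To be clear, this is not the algorithm from the SkeleDock publication. It is a deliberately simplified, greedy sketch under assumed data structures (element lists plus neighbour dictionaries), and it ignores bond orders, rings and geometry.

```python
from collections import deque

def extend_mapping(q_elems, q_bonds, t_elems, t_bonds, seed,
                   tolerate_mismatch=True):
    """Greedily grow a query->template atom mapping outward from `seed`.

    q_elems / t_elems: element symbol per atom index.
    q_bonds / t_bonds: {atom_index: [neighbour indices]}.
    seed: {query_index: template_index} from the strict first round.
    """
    mapping = dict(seed)
    used_t = set(mapping.values())
    frontier = deque(mapping.items())
    while frontier:
        q, t = frontier.popleft()
        # First pass pairs neighbours of the same element; the second pass
        # (only in tolerant mode) accepts element mismatches.
        for allow_mismatch in (False, tolerate_mismatch):
            for nq in q_bonds[q]:
                if nq in mapping:
                    continue
                candidates = [
                    nt for nt in t_bonds[t]
                    if nt not in used_t
                    and (allow_mismatch or t_elems[nt] == q_elems[nq])
                ]
                if candidates:
                    nt = candidates[0]
                    mapping[nq] = nt
                    used_t.add(nt)
                    frontier.append((nq, nt))
    return mapping

# Toy linear molecules: the query has an O where the template has an N,
# plus one extra terminal carbon.
t_elems = ["C", "C", "C", "N", "C"]
t_bonds = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
q_elems = ["C", "C", "C", "O", "C", "C"]
q_bonds = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
seed = {0: 0, 1: 1, 2: 2}

strict = extend_mapping(q_elems, q_bonds, t_elems, t_bonds, seed,
                        tolerate_mismatch=False)
tolerant = extend_mapping(q_elems, q_bonds, t_elems, t_bonds, seed,
                          tolerate_mismatch=True)
```

In the strict run the mapping stops at the O/N mismatch, so three query atoms stay free. In the tolerant run the walk steps over the mismatch and also maps the carbon behind it, leaving only the extra terminal carbon to be placed by docking.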

That’s all! We hope that you found this tutorial helpful and that it has encouraged you to try the app :)

References:

  • Varela-Rial, Alejandro, et al. SkeleDock: A Web Application for Scaffold Docking in PlayMolecule. May 2020, http://arxiv.org/abs/2005.05606.
  • Ruiz-Carmona S, Alvarez-Garcia D, Foloppe N, Garmendia-Doval AB, Juhos S, et al. (2014) rDock: A Fast, Versatile and Open Source Program for Docking Ligands to Proteins and Nucleic Acids. PLoS Comput Biol 10(4): e1003571. doi:10.1371/journal.pcbi.1003571
