AI in Drug Discovery

Testing AI Generated Small Molecule Drugs

An Overview of Docking and Beginner's Guide to Docking Tools

Quinn Wang
The Startup


Drug discovery is defined as the process by which new candidate medications are discovered [1]. This process includes:

  1. Vaccine development: vaccines are typically used to prevent viral infections. A vaccine is an agent that resembles the virus (it could be a weakened or killed version of it). The goal is to use a comparatively harmless agent so that the human immune system can generate antibodies without the person getting sick, and form a memory of how to fight off similar viruses in the future. An example is the rabies vaccine. If you get bitten by a street cat or dog, you will need (okay, 'need' depends on where you live) to get a rabies vaccine as soon as possible so that you are not 100% dead if the animal that bit you was infected with rabies. Fun story: I grew up in a place where people would die from rabies contracted through dog and cat bites (this includes even house pets, and sometimes just a scratch can cause infection). A couple of weeks ago I was bitten by a pet cat, and I was immediately extremely worried and wanted to get vaccinated. However, I was not able to, because the part of Canada where I live now is known to have been free of rabies for over 20 years. The point is, if you have this kind of anxiety about rabies because of your background, and you live somewhere where doctors don't share it, AVOID animals despite the social pressure!
  2. Antibody development: antibodies are essentially what your immune system produces to fight off a foreign agent. You can rely on your immune system alone if either it remembers the foreign agent well enough to produce effective antibodies quickly, or the agent is weak enough to give the immune system time to learn to produce enough antibodies. Otherwise, with novel and more deadly agents, injections of medically stored antibodies are needed. You may have heard of these being used after venomous snake bites. If you can identify the type of snake that bit you, and the hospital you are at has access to previously made antibodies against that snake's venom, the injected dose of antibodies will neutralize the venom in your blood.
  3. Small molecule development: a small molecule is a low molecular weight organic compound. Small molecules can be used as drugs to modify the disease process, for example by disrupting a pathway through which a virus replicates. Viral replication requires certain enzymes. For example, HIV-1 protease, through a chain of reactions, creates the mature protein components of the HIV virion. Without effective protease, the HIV virion remains noninfectious. Small molecules such as Ritonavir can bind to HIV protease and act as an inhibitor, hindering the enzyme from catalyzing the reaction. For someone like me who specializes in computer science and is not at all a pharmacologist, small molecule drug discovery is a good entry point to this field, since there is simulation software that runs on our own computers and helps validate the binding ability of a proposed molecule. There is also publicly available data expressed in a computer-readable format that we can use as training data. Sounds like a machine learning problem now?
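
To make "computer-readable" a bit more concrete: small molecules are commonly written as SMILES strings, where atoms are letters and bonds, branches, and rings are punctuation and digits. Below is a minimal sketch that counts the atoms written out in a SMILES string using only a regular expression. It is a simplification, not a real SMILES parser (it ignores implicit hydrogens, stereochemistry, and validity), and the aspirin string is just an illustrative example that does not come from this article.

```python
import re
from collections import Counter

# Bracket atoms such as [NH3+], then two-letter halogens, then one-letter atoms
# (uppercase = aliphatic, lowercase = aromatic). Bond symbols, parentheses, and
# ring-closure digits are simply skipped.
ATOM_PATTERN = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")


def count_written_atoms(smiles: str) -> Counter:
    """Count atoms that appear explicitly in a SMILES string (hydrogens are usually implicit)."""
    counts = Counter()
    for token in ATOM_PATTERN.findall(smiles):
        if token.startswith("["):
            # Extract the element symbol from a bracket atom, skipping an isotope number.
            match = re.match(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})", token)
            symbol = match.group(1) if match else token
        else:
            symbol = token
        counts[symbol.capitalize()] += 1
    return counts


aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"  # example molecule, not from the article
print(count_written_atoms(aspirin))
```

For real work a cheminformatics toolkit is a better fit than a regular expression; this sketch only shows that the string can be read mechanically.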

I explain how to interpret the standard SMILES string representation here:

The process of computer simulated binding between small molecules and proteins is called docking. The process of computer simulated screening of a library of compounds against potential drug targets is called virtual screening. In the next section I'm going to go over how to use the screening software PyRx with the docking tool AutoDock Vina to simulate binding between our small molecule and a protein.

First, you need to download PyRx for your operating system with the link below. Just for reference, I'm using the free version (version 0.8) for now.
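
Conceptually, virtual screening is just "dock every compound, then rank". Here is a minimal sketch of that loop. The function name `virtual_screen`, the ligand names, and the scores are all made up for illustration; in practice `score_fn` would wrap a real docking run.

```python
from __future__ import annotations

from typing import Callable, Iterable


def virtual_screen(
    library: Iterable[str],
    score_fn: Callable[[str], float],
    top_k: int = 3,
) -> list[tuple[str, float]]:
    """Score every compound and return the top_k with the most negative scores."""
    scored = [(compound, score_fn(compound)) for compound in library]
    # Docking scores are binding affinities: more negative means better.
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


# Made-up scores (kcal/mol) standing in for real docking results.
fake_scores = {"ligand_a": -6.2, "ligand_b": -9.1, "ligand_c": -7.4, "ligand_d": -5.0}
hits = virtual_screen(fake_scores, fake_scores.get, top_k=2)
print(hits)
```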

You should see something like this once you open up the software:

PyRx 0.8's very old-school UI

Let's see how docking in PyRx works by using the example of HIV protease and Ritonavir binding.

The HIV protease file is downloaded from the Protein Data Bank. Download the file in the regular PDB format:
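
If you would rather fetch the structure from a script, the sketch below builds the RCSB file-download URL for a four-character PDB ID. The ID `1HSG` (an HIV-1 protease entry) is used purely as an example and is an assumption on my part; it is not necessarily the entry used in this walkthrough. The helper names are also hypothetical.

```python
from pathlib import Path
from urllib.request import urlretrieve

RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id: str) -> str:
    """Return the download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD.format(pdb_id=pdb_id)


def download_pdb(pdb_id: str, out_dir: str = ".") -> Path:
    """Download the PDB file into out_dir (needs network access)."""
    target = Path(out_dir) / f"{pdb_id.strip().upper()}.pdb"
    urlretrieve(pdb_url(pdb_id), target)
    return target


# Only the URL is printed here; call download_pdb("1HSG") to actually fetch the file.
print(pdb_url("1hsg"))
```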

Go to PyRx, click File → Load Molecule, and you’ll see the molecule displayed as a 3D model:

Next to the Molecules tab, click on AutoDock. You will see that the molecule we just imported is not yet listed here. Go back to the Molecules tab, right click and select AutoDock → Make Macromolecule.

Then the molecule should be added in AutoDock as a macromolecule.
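
It can help to know what is inside the PDB file we just loaded: each `ATOM` line stores one atom, with its x, y, z coordinates in fixed-width columns. The sketch below reads those coordinates and computes the box that encloses them, which is roughly the kind of region a docking search box has to cover. The coordinates are made up, and `demo_atom_line` exists only to generate correctly aligned sample lines.

```python
def atom_coordinates(pdb_text):
    """Return (x, y, z) tuples for every ATOM record in PDB-format text."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            # Fixed-width columns 31-38, 39-46, 47-54 hold x, y, z in angstroms.
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def bounding_box(coords):
    """Return (center, size) of the axis-aligned box that encloses all atoms."""
    axes = list(zip(*coords))
    center = tuple((min(a) + max(a)) / 2 for a in axes)
    size = tuple(max(a) - min(a) for a in axes)
    return center, size


def demo_atom_line(serial, name, x, y, z):
    """Build one made-up ATOM record with aligned coordinate columns (demo only)."""
    return f"ATOM  {serial:>5} {name:<4} ALA A   1    {x:8.3f}{y:8.3f}{z:8.3f}"


sample = "\n".join([
    demo_atom_line(1, "N", 10.0, 20.0, 30.0),
    demo_atom_line(2, "CA", 14.0, 26.0, 32.0),
    demo_atom_line(3, "C", 12.0, 22.0, 38.0),
])
center, size = bounding_box(atom_coordinates(sample))
print("center:", center, "size:", size)
```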

Then download the Ritonavir ligand:

Go to File → Import, select Chemical Table File and choose the file you just downloaded (if your computer acts up like mine did and opens the SDF in a web page, just copy all the content from the browser into a text editor and save it as a .sdf file). This is what you will see:

Imported Ritonavir ligand
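
If you had to copy and paste the SDF content by hand, a quick sanity check can confirm nothing was lost. The sketch below reads the counts line of a V2000 record (the fourth line, where the first two three-character fields are the atom and bond counts) and checks for the `M  END` and `$$$$` markers. The water record is hand-written for the demo; it is not the Ritonavir file, and `check_sdf_record` is a hypothetical helper name.

```python
def check_sdf_record(sdf_text: str) -> dict:
    """Summarize the first V2000 record in SDF text so truncation is easy to spot."""
    lines = sdf_text.splitlines()
    counts_line = lines[3]  # three header lines come first
    n_atoms = int(counts_line[0:3])
    n_bonds = int(counts_line[3:6])
    atom_lines = lines[4:4 + n_atoms]
    return {
        "title": lines[0].strip(),
        "n_atoms": n_atoms,
        "n_bonds": n_bonds,
        # The element symbol sits in columns 32-34 of each atom line.
        "symbols": [line[31:34].strip() for line in atom_lines],
        "has_m_end": any(line.startswith("M  END") for line in lines),
        "has_terminator": any(line.strip() == "$$$$" for line in lines),
    }


# Hand-written demo record (water), not the ligand from the article.
demo_sdf = "\n".join([
    "water",
    "  hand-written demo",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.9572    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.2400    0.9266    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "  1  3  1  0",
    "M  END",
    "$$$$",
])
print(check_sdf_record(demo_sdf))
```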

I’m going to first go to Edit → Preferences and set my CPU limit to 2 so that docking doesn’t take up all the CPUs on my computer. Then in the controls panel, right click on the ligand and select minimize selected. This runs an energy minimization to find a low-energy conformation of the ligand:
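
To give a feel for what "minimizing energy" means, here is a toy sketch: two atoms interacting through a Lennard-Jones potential, with gradient descent nudging their distance toward the energy minimum. This is not what PyRx runs internally (real minimizers work on all atoms with a full force field); the potential, step size, and function names are my own assumptions for illustration.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy of two atoms at distance r (reduced units)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative of the Lennard-Jones energy with respect to r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r


def minimize_distance(r0, step=0.01, iterations=500):
    """Plain gradient descent: repeatedly move r downhill in energy."""
    r = r0
    for _ in range(iterations):
        r -= step * lj_gradient(r)
    return r


r_min = minimize_distance(1.5)
print(f"distance: {r_min:.4f}, energy: {lj_energy(r_min):.4f}")
```

The analytic minimum of this potential is at r = 2^(1/6)·sigma with energy -epsilon, which gives something to compare the result against.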

Right click again and select convert to AutoDock ligand:

Now in the AutoDock tab we have our macromolecule and our ligand:

We are ready to run some docking!

In the controls panel, select Vina Wizard and press Start. Select the ligand and the macromolecule, then press Forward. In Run Vina, press Forward again and it will run.
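
The wizard does roughly what you would do when calling AutoDock Vina from the command line. As a sketch, the function below assembles such a command; it only builds the argument list and does not run anything. The file names, the box center and size, and the assumption that the executable is called `vina` are all placeholders, so adjust them to your setup.

```python
import shlex


def vina_command(receptor, ligand, center, size, out="docked.pdbqt", cpu=2, exhaustiveness=8):
    """Assemble an AutoDock Vina command line as a list of arguments."""
    cx, cy, cz = center
    sx, sy, sz = size
    return [
        "vina",  # assumed executable name
        "--receptor", receptor,
        "--ligand", ligand,
        # Search box: where on the protein Vina is allowed to place the ligand.
        "--center_x", str(cx), "--center_y", str(cy), "--center_z", str(cz),
        "--size_x", str(sx), "--size_y", str(sy), "--size_z", str(sz),
        "--out", out,
        "--cpu", str(cpu),  # same idea as the CPU limit in PyRx preferences
        "--exhaustiveness", str(exhaustiveness),
    ]


# Placeholder file names and box values, for illustration only.
cmd = vina_command(
    "protease.pdbqt",
    "ritonavir.pdbqt",
    center=(12.0, 23.0, 34.0),
    size=(20.0, 20.0, 20.0),
)
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would launch the docking, provided Vina is installed and the input files exist.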

When the program finishes running, we will get a report of the result:

We can see the binding affinity (in kcal/mol), where a more negative number indicates stronger predicted binding (note that there is some uncertainty here). There are several affinities for the same ligand because AutoDock Vina computes the binding affinity for different poses (orientations) of the ligand bound to the macromolecule.

Docking is an important tool to learn in AI drug discovery because it lets you evaluate your proposed ligands.
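
If you want the same numbers outside the GUI, Vina writes one `REMARK VINA RESULT:` line per pose into its output file, holding the affinity followed by two RMSD values relative to the best pose. The sketch below pulls those out and picks the best pose. The sample text and its numbers are made up for the demo; they are not results from this run.

```python
def parse_vina_results(pdbqt_text):
    """Return one dict per docked pose found in Vina output text."""
    modes = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            values = line.split(":", 1)[1].split()
            affinity, rmsd_lb, rmsd_ub = (float(v) for v in values[:3])
            modes.append({
                "mode": len(modes) + 1,
                "affinity": affinity,  # kcal/mol, more negative is better
                "rmsd_lb": rmsd_lb,
                "rmsd_ub": rmsd_ub,
            })
    return modes


# Made-up output fragment for illustration.
sample_output = "\n".join([
    "MODEL 1",
    "REMARK VINA RESULT:      -9.1      0.000      0.000",
    "ENDMDL",
    "MODEL 2",
    "REMARK VINA RESULT:      -8.7      1.532      2.874",
    "ENDMDL",
])
modes = parse_vina_results(sample_output)
best = min(modes, key=lambda m: m["affinity"])
print(best)
```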

Resources:

[1]. “The drug development process”. US Food and Drug Administration. 4 January 2018. Retrieved 18 December 2019.


Quinn Wang
The Startup

Data analyst with an interest in machine learning. Passionate about understanding the theoretical backings of ML algorithms.