Ride, Dr. Hofmann’s Bicycle: an HTVS-based Drug-Discovery Trip in The Chemical Space of LSD using AcePrep, AceDock, and Acellera’s Generative Models [TUTORIAL]

Roberto Fino, Dr.

Introduction

Albert Hofmann was a chemist in the natural product division of Sandoz’s labs (now Novartis) in Basel, where he worked on natural alkaloids extracted from plants with Professor Arthur Stoll. He was able to extract and purify lysergic acid from ergot and to react it with diethylamine to create its diethylamide derivative: lysergic acid diethylamide (LSD). On the 19th of April 1943, Dr. Hofmann decided to fast-forward directly to Phase 2 studies and ingested 0.25 mg of pure LSD to test the substance’s effects on humans; he then asked his assistant to escort him home by bike and to note in a journal the physiological effects of the drug on him. While the two were riding, the LSD started to kick in: Hofmann first described a state of anxiety in which he believed that his neighbor was a witch (who hasn’t thought something like this?), followed by a state of altered perception in which the objects around him blended and turned into colorful spirals and crazy geometric shapes. Since then, the 19th of April has been celebrated as Bicycle Day to remember Dr. Hofmann’s first-ever LSD trip.

LSD blotter commemorating the Bicycle Day (Source: Wikipedia)

Besides its recreational uses, and inspiring the Beatles’ White Album and other great music, LSD has recently gained momentum as a candidate for repurposing as a new anxiolytic drug. Boosted by the recent X-ray crystallography characterization of the 3D structure of LSD bound to the human 5HT2-B serotonin receptor (PDB accession code: 5TVN), medicinal chemists have started their journey to decode how LSD interacts with one of its possible binding targets.

In this tutorial, we will explore the chemical space around LSD, starting from this structure of LSD bound to the 5HT2-B receptor. We will use LSD as a template to generate a small library of analogs and then, finally, prepare the structures, set up a 3D docking HTVS, and inspect any interesting compounds that arise from this campaign.

We are going to use the following apps available on PlayMolecule:

  1. AcePrep: Acellera’s small-molecule preparator to prepare our ligands;
  2. ProteinPrepare: Acellera’s receptor preparator (supports any biopolymer, including RNA and DNA);
  3. Generative, LigaNN, LigDream: Acellera’s chemical-space exploration and de novo design generator apps;
  4. AceDock: Acellera’s 3D docking solution.

Let’s see what we can find.

Getting the structures

First of all, let’s take a look at the protein-ligand complex structure. The 5HT2-B receptor is a GPCR that is expressed in human neurons, and its endogenous binder is serotonin (also known as 5-hydroxytryptamine, or 5-HT). LSD and serotonin share the same tryptamine substructure (highlighted in green in the figure below), which is a key feature for binding:

Substructure matching of tryptamine for serotonin and LSD.
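
The same match can be reproduced with RDKit. This is only a minimal sketch: the SMILES strings are written by hand without stereochemistry (they are not extracted from the PDB entry), and tryptamine is used as a plain substructure query.

```python
from rdkit import Chem

# Hand-written SMILES without stereochemistry (an assumption of this sketch)
SMILES = {
    "serotonin": "NCCc1c[nH]c2ccc(O)cc12",
    "LSD": "CCN(CC)C(=O)C1CN(C)C2Cc3c[nH]c4cccc(C2=C1)c34",
}
tryptamine = Chem.MolFromSmiles("NCCc1c[nH]c2ccccc12")

matches = {}
for name, smi in SMILES.items():
    mol = Chem.MolFromSmiles(smi)
    # Atom indices of the tryptamine scaffold in each molecule (empty tuple if absent)
    matches[name] = mol.GetSubstructMatch(tryptamine)
    print(name, "contains tryptamine:", bool(matches[name]))
```

The returned atom indices are what a drawing routine would need to highlight the shared scaffold.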

The structures depicted above are the neutral forms of LSD and serotonin; in solution, the non-aromatic nitrogen in both molecules is mostly protonated, because the pKa of its conjugate acid is around 8 or higher. We can generate the right protomers using AcePrep.
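
To put a number on this, the Henderson-Hasselbalch relation gives the fraction of a base that carries the extra proton at a given pH. The sketch below assumes a single titratable amine and uses illustrative pKa values; it is not how AcePrep assigns protomers.

```python
def protonated_fraction(pka: float, ph: float = 7.4) -> float:
    """Fraction of a basic amine present in its protonated (charged) form."""
    return 1.0 / (1.0 + 10 ** (ph - pka))

# Illustrative pKa values, not measured ones
for pka in (8.0, 9.0, 10.0):
    print(f"pKa {pka:4.1f}: {protonated_fraction(pka):.0%} protonated at pH 7.4")
```

With a pKa of 8, roughly four molecules out of five are charged at pH 7.4, which is why the charged protomer is the one to dock.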

Let’s get the structures from the RCSB PDB database; to get them, just click on the links below:
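
The receptor file can also be fetched with a few lines of Python. This is a minimal sketch that relies on the public RCSB file-download URL pattern; the function names are hypothetical helpers, and the ligand SDF is not handled here.

```python
from pathlib import Path
from urllib.request import urlretrieve

def rcsb_pdb_url(pdb_id: str) -> str:
    """Build the RCSB file-download URL for a PDB entry."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"

def download_pdb(pdb_id: str, out_dir: str = ".") -> Path:
    """Download the entry and return the local path (requires network access)."""
    out_path = Path(out_dir) / f"{pdb_id.upper()}.pdb"
    urlretrieve(rcsb_pdb_url(pdb_id), str(out_path))
    return out_path

print(rcsb_pdb_url("5tvn"))
```

Calling download_pdb("5TVN") would save 5TVN.pdb in the current directory.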

Preparing the structures

First, let’s prepare the receptor for docking by going here: https://playmolecule.com/proteinPrepare/. Once on ProteinPrepare, we can start the preparation either by entering the PDB code directly in the PDB id field or by uploading the structure we previously downloaded. Check the Remove water and Include heteroatoms in pKa calculation options, select the chain (in this case there is just chain A), accept the terms agreement, and, finally, click on the SUBMIT button.

ProteinPrepare landing page.

The calculations will take some time, so in the meantime, we can go to https://playmolecule.com/aceprep/ and prepare the ligand. Upload the SDF we previously downloaded; in this case, we won’t need to generate any conformers, because we need to preserve the coordinates of the original structure, so just give the job a name (e.g. ‘LSD-PDB’) and click on the SUBMIT button:

AcePrep landing page.

Let’s now retrieve the job outputs. We now have a fully-titrated receptor at physiological pH of 7.4 and the right protomer for LSD.

ProteinPrepare output for 5-HT2-B receptor.
LSD protomer generated by AcePrep.

Bonus: once we have titrated the receptor and the ligand, we can use PlexView to get a visual depiction of the key protein-ligand interactions. Let’s go to https://playmolecule.com/PlexView/, upload the prepared receptor and the prepared ligand, and submit the job:

Depiction of the key interactions using PlexView.

Generating new molecules

Now that we have prepared the structures, it’s time to start our journey in the chemical space surrounding LSD. For this purpose, we are going to use both ligand- and structure-based methods. LigaNN is Acellera’s solution for generating new, IP-free molecules from a given protein-ligand complex using generative adversarial networks (GANs). LigDream can perform the same task, but starting from any SMILES string provided. Generative, instead, is based on the CrEM library and is an elegant solution for virtual hit optimization, hit growing, and fragment merging. The main advantage of using CrEM, even though it is based on an apparently ‘simple’ enumeration of SMILES combinations, is that the authors of the software included only chemically reasonable substitutions in the algorithm that generates the new chemicals: this makes the life of MedChems way easier by avoiding the generation of impossible-to-synthesize molecules. Furthermore, the CrEM approach is fast and exhaustive, with a good trade-off between performance and the quality of the output.

Beware: at the time of writing, both LigaNN and LigDream generate a CSV output with a list of names and SMILES strings; for this reason, we are going to use some Python to convert the files to SDF for further processing. We won’t need this for Generative, as it already outputs an SDF.

Ok, starting with LigaNN: let’s upload the prepared structures into the app. I set the number of Ligand shape generations to 50 and the number of Decodings per shape to 20 to maximize the number of molecules in the output. This way, we will also generate some fragment-like molecules, but we can easily filter these out with RDKit when we convert the CSV to SDF.

LigaNN settings.

With these settings, I obtained slightly less than 800 molecules.

Let’s see what we can obtain with LigDream. On the app page, we can leave the default settings and just check the RNN checkbox to add some noise to the generator and bias the output a bit away from LSD.

LigDream settings.

With Generative, I decided to use an approach based on fragment merging. Because of how CrEM is conceived, no new cores are generated here; instead, the main core is used as a canvas for the new chemical space. In the case of LSD, there is not much room left for MedChem optimization, but we know its core pharmacophoric features: an indole and a positive charge. For this reason, I ran three runs with indole combined with three different cyclic secondary amines: aziridine (3-membered ring), pyrrolidine (5), and piperidine (6). I generated 300 molecules overall and then kept only the top 30 compounds, sorted by the quantitative estimate of drug-likeness (QED) index, which is calculated by the Generative app using the respective RDKit module.
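
The QED-based selection can be reproduced outside the app with RDKit's QED module. The sketch below is an approximation under stated assumptions: top_by_qed is a hypothetical helper, and the toy input is just indole plus the three amines named above, not real Generative output.

```python
from rdkit import Chem
from rdkit.Chem import QED

def top_by_qed(smiles_list, n=30):
    """Return the n (QED, SMILES) pairs with the highest QED, best first."""
    scored = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:  # skip SMILES that RDKit cannot parse
            continue
        scored.append((QED.qed(mol), smi))
    return sorted(scored, reverse=True)[:n]

# Toy input: indole, aziridine, pyrrolidine, piperidine
toy = ["c1ccc2[nH]ccc2c1", "C1CN1", "C1CCNC1", "C1CCNCC1"]
for score, smi in top_by_qed(toy, n=2):
    print(f"{score:.2f}  {smi}")
```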

Let’s now see how we can use Python to convert the output of LigDream and LigaNN to SDF.

import os
import pandas as pd
from rdkit import Chem

# For LigaNN output (whitespace-separated columns)
ligann_csv = pd.read_csv('/home/rubbs/Desktop/medium_aceprep/ligann-LSD/generatedMolecules_prot.csv', sep=r'\s+')
lignn_sdf = []
for _, row in ligann_csv.iterrows():
    rd_mol = Chem.MolFromSmiles(row['smiles'])
    if rd_mol is None:  # skip SMILES that RDKit cannot parse
        continue
    # SetProp() only accepts strings, so cast every value
    rd_mol.SetProp('_Name', str(row['name']))
    rd_mol.SetProp('Shape', str(row['shape']))
    lignn_sdf.append(rd_mol)
print(f'Total molecules: {len(lignn_sdf)}')

# For LigDream output, we won't have the shape field, so we have to change things a little.
# The separator changes as well, so we'll leave the pandas default sep
ligdrm_csv = pd.read_csv('/home/rubbs/Desktop/medium_aceprep/ligdream-LSD/generatedMolecules_mol1.csv')
ligdr_sdf = []
for _, row in ligdrm_csv.iterrows():
    rd_mol = Chem.MolFromSmiles(row['Smile'])
    if rd_mol is None:  # skip SMILES that RDKit cannot parse
        continue
    rd_mol.SetProp('_Name', str(row['Name']))
    rd_mol.SetProp('Parent', str(row['parent']))
    ligdr_sdf.append(rd_mol)
print(f'Total molecules: {len(ligdr_sdf)}')

# Let's define a custom Chem.SDWriter() function so we don't repeat ourselves (DRY)
def custom_sdf_writer(cpd_list, file_out):
    out_dir = os.path.dirname(file_out)
    # Create the output directory only when the path actually contains one
    if out_dir and not os.path.exists(out_dir):
        print('Creating output directory')
        os.makedirs(out_dir, exist_ok=True)

    w = Chem.SDWriter(file_out)
    print(f'Writing {len(cpd_list)} molecules')

    for m in cpd_list:
        w.write(m)
    w.close()
    print(f'Saved SDF: {file_out}')

# Let's write the files out
custom_sdf_writer(ligdr_sdf, file_out='/your/custom/path/ligdream-results/ligdream_output.sdf')
custom_sdf_writer(lignn_sdf, file_out='/your/custom/path/ligann-results/lignn_output.sdf')
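
The fragment-like molecules mentioned earlier can be dropped before writing the SDF. Here is a minimal sketch, assuming a heavy-atom cutoff is an acceptable definition of "fragment-like"; drop_fragments and the threshold of 15 are arbitrary choices for illustration, not values used in the original run.

```python
from rdkit import Chem

def drop_fragments(mols, min_heavy_atoms=15):
    """Keep molecules with at least `min_heavy_atoms` non-hydrogen atoms."""
    return [m for m in mols
            if m is not None and m.GetNumHeavyAtoms() >= min_heavy_atoms]

# Toy input: ethanol (3 heavy atoms) and LSD (24 heavy atoms)
toy = [Chem.MolFromSmiles(s)
       for s in ("CCO", "CCN(CC)C(=O)C1CN(C)C2Cc3c[nH]c4cccc(C2=C1)c34")]
kept = drop_fragments(toy)
print(f"Kept {len(kept)} of {len(toy)} molecules")
```

The same function can be applied to a list of RDKit molecules right before it is handed to an SD writer.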

Prepare and Dock

Now that we have our ligands, we can prepare them with AcePrep and then dock them with AceDock. I decided to merge all the files into a single SDF library of 50 compounds.
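
Merging SDF files does not require a cheminformatics toolkit, because every record in an SDF ends with a line containing only `$$$$`. A minimal sketch follows; merge_sdf is a hypothetical helper and any file names passed to it are up to you.

```python
from pathlib import Path

def merge_sdf(sdf_paths, out_path):
    """Concatenate SDF files and return the number of records written."""
    n_records = 0
    with open(out_path, "w") as out:
        for path in sdf_paths:
            text = Path(path).read_text()
            if text and not text.endswith("\n"):
                text += "\n"  # keep the next file's first record on its own line
            # Every SDF record is terminated by a line containing only '$$$$'
            n_records += sum(1 for line in text.splitlines()
                             if line.strip() == "$$$$")
            out.write(text)
    return n_records
```

The returned count is a quick sanity check that the merged library has the expected number of compounds.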

I can now upload them on AcePrep and run preparation. I will have to generate just one conformation per compound, as AceDock accepts only 3D SDFs.

Final compound selection. Compounds look promising.

Let’s take a look at the library molecular descriptors distributions:

Charge distribution. Most of the compounds fulfill one of the required pharmacophoric features (the positive charge).

It looks like the compounds comply with most of the common rules for drug-like small molecules.

Common rules to define drug-likeness.
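
One widely used rule set of this kind is Lipinski's rule of five; whether the AcePrep report uses exactly these thresholds is an assumption on my side. The sketch below counts violations from precomputed descriptors, using illustrative values rather than numbers taken from the report.

```python
def lipinski_violations(mw, logp, hbd, hba):
    """Count violations of Lipinski's rule of five from precomputed descriptors."""
    checks = [
        mw > 500,   # molecular weight above 500 Da
        logp > 5,   # logP above 5
        hbd > 5,    # more than 5 hydrogen-bond donors
        hba > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)

# Illustrative descriptor values for an LSD-sized molecule
print(lipinski_violations(mw=323.4, logp=2.9, hbd=1, hba=2))
```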

The molecular descriptor distributions also look good:

Distribution of common molecular properties descriptors.
Structural descriptors.

Besides the plots shown here, AcePrep also outputs a report as an Excel file with the 2D structure depictions of the molecules and a summary of all the SDF properties as columns.

Now we can download the prepared library and submit it to AceDock. We will upload the receptor prepared with ProteinPrepare, the prepared LSD structure, and the prepared library. I ticked the option Pharmacophoric rescoring keeping in mind that I generated molecules in order to match the same pharmacophore as LSD’s.

AceDock setup.

Let’s click on submit and grab something to drink: this can take up to an hour, depending on how many compounds you have and your settings.

When AceDock is finished, a list of all the docked compounds with the desired number of poses will be prepared for download. You can use PlayMolecule viewer to display them or just download the files to inspect them in your favorite molecular visualizer.

In my run, I found a couple of interesting hits; this is one of them:

Nice hit.

After downloading the results, loading them in PyMOL, and inspecting them further, this hit also seems quite interesting:

Interesting hit #2 aligned to LSD (yellow lines).

Did you find any interesting hits? Let us know in the comment section!

P.S. In case you are wondering where the title of the post comes from, look here.
