REAL Fragment Library — Efficient fragment growing inside Enamine REAL

Anna Kapeliukha
Apr 4, 2024

Today I want to dive deeper into the development of one of our recent products, the REAL Fragment Library. It is a pre-plated set of 4,960 compounds designed as an entry point to the 2.7 trillion version of Enamine REAL Space, and it is meant to facilitate fragment growing and SAR analysis in fragment-based drug discovery.

In this post, we will discuss the general idea behind it, its benefits, the scaffold annotation process, and review some practical application examples.

If you are interested, you can download the library here: https://tinyurl.com/mr3rpufh

P.S. We are also working on a version suitable for crystallography, so stay tuned!

REAL Fragment Library — Project idea

Fragment-based drug discovery has proven itself numerous times as a successful approach to hit identification, but it usually comes with the difficulty of growing the fragment once it has been identified. Fragment growing can be performed in multiple ways, including rational design and novel AI-based generative methods, but all of them require expensive and time-consuming custom synthesis. In addition, performing SAR analysis and reaching the desired active molecules takes tens or even hundreds of compounds, and waiting for their synthesis can slow the project down considerably.

This is why we came up with the idea of using the modular nature of Enamine REAL Space to create a fragment library that will be the perfect entry point for using REAL compounds during the fragment growing stage. This opens the possibility of receiving your full molecules in 3–4 weeks and at a competitive price, streamlining the discovery process.

REAL Fragment Library — Overview and usage example

The library consists of 4,960 compounds that are pre-plated and can be ordered from Chemspace, though we support any kind of customization.

The compounds were selected from Enamine in-stock fragments using the concept of scaffold sociability, which is described in more detail later in this article. The selected scaffolds efficiently cover the synthon scaffold space available in Enamine REAL.

To understand a general use case, let’s review the example illustrated below.

A fragment screen yields a promising fragment, and crystallography then gives you a structure of this fragment bound to the protein. After analyzing the crystal structure, you can define the directions in which you want to grow your fragment.

During the design of the library, we selected the scaffolds with the greatest diversity of synthons and exit vectors, so that a large number of follow-up compounds can be provided.

Then, you can define what kind of synthons you want to use to grow your fragment, or we can create a custom chemical space that could be explored by using computational methods. Our computational team will be happy to help with this step.

When the final selection is complete, the compounds will be delivered to your door in 3–4 weeks with an 80% synthesis success rate.

Additionally, after you obtain active full molecules, it is very easy to create sets for SAR evaluation as Enamine REAL contains a great variety of diverse synthons.

Fragments selection criteria — the concept of scaffold sociability

As the size of the library is limited, a thorough scaffold selection was required to yield the most diversity in terms of:

  • Number of unique reactions per scaffold
  • Number of unique synthons per scaffold
  • Exit vector diversity — number of unique angles
  • Exit vector coverage — coverage of growing direction quadrants in 2D coordinate space

While the first two points could be achieved by simple data analysis, the exit vector diversity analysis required the development of a custom algorithm. All scaffolds of the REAL Space synthons were annotated according to these criteria.
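The first two criteria amount to counting distinct values per scaffold. A minimal sketch of that counting step is shown here, assuming the annotation data is available as (scaffold, synthon, reaction) records. The record layout, the identifiers, and the function name `sociability_counts` are hypothetical and are not taken from the actual REAL annotation pipeline.

```python
from collections import defaultdict

# Hypothetical annotation records: (scaffold_id, synthon_id, reaction_id).
# Real REAL Space data will use different identifiers and a different layout.
records = [
    ("pyridine", "s1", "amide_coupling"),
    ("pyridine", "s2", "amide_coupling"),
    ("pyridine", "s3", "suzuki"),
    ("piperidine", "s4", "reductive_amination"),
    ("piperidine", "s4", "amide_coupling"),
]


def sociability_counts(records):
    """Count unique reactions and unique synthons for every scaffold."""
    reactions = defaultdict(set)
    synthons = defaultdict(set)
    for scaffold, synthon, reaction in records:
        reactions[scaffold].add(reaction)
        synthons[scaffold].add(synthon)
    return {
        scaffold: {
            "n_reactions": len(reactions[scaffold]),
            "n_synthons": len(synthons[scaffold]),
        }
        for scaffold in reactions
    }


counts = sociability_counts(records)
for scaffold, c in sorted(counts.items()):
    print(scaffold, c)
```

Scaffolds could then be ranked by these counts before the exit vector criteria are applied.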

To select the fragments for the library, we used the Enamine in-stock fragment collection. Compounds were evaluated by their physicochemical properties, substructure filters, and minimal pharmacophore diversity, and then clustered using the Flexophore [1] descriptor to make the final selection.

Custom annotation algorithm overview

All the synthons used for REAL annotation have a defined, labeled reaction site, which is used in the exit vector calculation. To simplify this process, we decided to work in the 2D coordinate plane, as this approximation is sufficient to estimate the possible diversity.

The exit vector is calculated as the angle of the line drawn from the origin (0,0) of the coordinate plane to the first atom of the reactive group attached to the scaffold. Based on the obtained angles, it is possible to calculate the number of quadrants covered for a specific scaffold. In the example below, we have 3 exit vectors that cover 3 quadrants.
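The angle and quadrant calculation can be sketched in a few lines. This is an illustration only and not our production code: it assumes that the scaffold has already been placed so that its center sits at the origin, that angles are measured counterclockwise from the positive x axis, and that an atom lying exactly on an axis is assigned to the quadrant that starts there. The coordinates and function names are hypothetical.

```python
import math


def exit_vector_angle(x, y):
    """Angle in degrees, in [0, 360), of the line from the origin to (x, y)."""
    return math.degrees(math.atan2(y, x)) % 360.0


def quadrant(angle):
    """Quadrant 1-4 for an angle in degrees (assumed convention for axis values)."""
    return int(angle // 90) % 4 + 1


def quadrants_covered(points):
    """Set of quadrants reached by the exit vectors of one scaffold."""
    return {quadrant(exit_vector_angle(x, y)) for x, y in points}


# Hypothetical 2D positions of the first atom of each reactive group.
points = [(1.2, 0.8), (-0.9, 1.1), (-1.0, -1.3)]

angles = [exit_vector_angle(x, y) for x, y in points]
unique_angles = {round(a, 1) for a in angles}  # exit vector diversity
covered = quadrants_covered(points)            # exit vector coverage

print(sorted(unique_angles))
print(sorted(covered))
```

With these three hypothetical positions, the scaffold has three distinct angles that fall into quadrants 1, 2 and 3, which mirrors the three-vector case described in the text.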

This concept is quite self-explanatory, but we also had to account for scaffold symmetry: benzene, for example, has only 1 unique angle, yet it can cover all 4 quadrants.

For this, we developed a separate algorithm that calculates the symmetry of atom positions in the 2D coordinate space. Thus, another metric was introduced: the coverage score. The coverage score indicates the number of quadrants that can be covered by the exit vectors once symmetry is taken into account. You can see some examples in the picture above.
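One simple way to picture the effect of symmetry on coverage is sketched here. It is not the algorithm used for the library: it assumes that the rotational symmetry order of the scaffold about the origin is already known (6 for benzene, 1 for an asymmetric scaffold), it ignores mirror symmetry, and all function names are hypothetical.

```python
def quadrant(angle):
    """Quadrant 1-4 for an angle in degrees (axis values go to the next quadrant)."""
    return int(angle // 90) % 4 + 1


def symmetry_images(angle, order):
    """All angles equivalent to `angle` under an n-fold rotation about the origin."""
    step = 360.0 / order
    return [(angle + k * step) % 360.0 for k in range(order)]


def unique_angles(angles, order, ndigits=1):
    """Angles that remain distinct after folding by the rotational symmetry."""
    step = 360.0 / order
    reduced = set()
    for a in angles:
        r = round((a % 360.0) % step, ndigits)
        reduced.add(0.0 if r >= step else r)
    return reduced


def coverage_score(angles, order):
    """Number of quadrants reachable by any symmetry image of any exit vector."""
    reached = set()
    for a in angles:
        for image in symmetry_images(a, order):
            reached.add(quadrant(image))
    return len(reached)


# Benzene-like scaffold: sixfold symmetry, substituent positions every 60 degrees.
benzene_unique = unique_angles([30.0, 90.0, 150.0], order=6)
benzene_score = coverage_score([30.0], order=6)

# Asymmetric scaffold: every exit vector counts only for its own quadrant.
asymmetric_score = coverage_score([33.7, 129.3, 232.4], order=1)

print(benzene_unique, benzene_score, asymmetric_score)
```

In this sketch the benzene-like scaffold collapses to a single unique angle but still reaches all four quadrants, while the asymmetric scaffold keeps three unique angles and reaches three quadrants.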

Future perspective

The main goal of the REAL Fragment Library is to facilitate the process of fragment growing, but it also opens additional opportunities.

  • Fast SAR analysis. Due to the number of synthons in REAL Space, it is possible to rapidly develop compound sets for SAR and evaluate them both experimentally and computationally.
  • Development of custom focused compound sets.
  • Development of custom fragment libraries. If a company has a proprietary fragment collection, the annotated REAL scaffolds can be used to create custom libraries on demand.

We plan to develop this algorithm further so that it can be used for fragment merging and for the annotation of scaffolds with multiple exit vectors.

References

  1. https://pubs.acs.org/doi/10.1021/ci700359j
