Developing bifunctional protein degraders for difficult-to-drug proteins with the Receptor.AI Platform
Protein degraders are the new “big tech” in drug discovery
Proteolysis-targeting chimaeras (PROTACs) and lysosome-targeting chimaeras (LYTACs) are emerging as promising alternatives to conventional small-molecule drugs for the targeted degradation of proteins. Both are molecules that consist of three components: a “warhead” that binds to the target protein, an “anchor” that binds to a ubiquitin ligase or a lysosome-trafficking receptor, and a linker that connects the two. PROTACs facilitate ubiquitination, which marks the target protein for degradation by the proteasome, while LYTACs flag the protein for transport to lysosomes, where it is degraded.
This novel approach has several advantages over traditional small-molecule drugs. One of the key features of PROTACs and LYTACs is their ability to target proteins that are not accessible to small-molecule drugs. This is because PROTACs and LYTACs do not need to target the active sites of proteins, do not act as competitive inhibitors and do not aim to interfere with the protein function directly. The warhead may bind virtually anywhere on the protein surface; the only requirement is binding strong enough to attach the protein to the ubiquitin ligase or trafficking receptor. This gives PROTACs and LYTACs much wider applicability and lets them work with target proteins that are traditionally considered undruggable.
In addition, PROTACs and LYTACs have the potential to be more potent and selective than traditional small-molecule drugs. This is because they act through targeted degradation rather than through inhibition, which is always limited by concentration-dependent effects.
Thus, it is not surprising that there is high demand for the rational design of new PROTACs and LYTACs. This is where Receptor.AI comes into play, with state-of-the-art AI technologies that allow fast and efficient design of PROTACs and LYTACs with superior properties and a high success rate.
Receptor.AI platform for PROTAC and LYTAC design
The Receptor.AI platform supports the design of both PROTACs and LYTACs using similar workflows. The platform follows an iterative design-test-modify development loop, in which experimental feedback is incorporated directly into the design workflow.
Let’s consider the workflow of PROTACs (Fig. 2) as an example.
General provisions of warhead and anchor design
We start by predicting the most promising pocket for warhead binding on the protein of interest (POI). This pocket may or may not coincide with the functionally active sites, and its prediction is facilitated by a dedicated AI model. In parallel, the site for anchor binding is identified on the desired ubiquitin ligase.
Both the POI and the ligase are subjected to initial virtual screening using the dedicated module of our platform, which incorporates two affinity prediction AI models and a proteome-wide selectivity prediction.
After that, selected candidate compounds are forwarded to the secondary screening model, which filters them by 60+ ADME-Tox endpoints, physicochemical parameters and drug-likeness metrics. Docking with AI rescoring follows and prioritises the best candidate compounds.
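As an illustration only, the screening funnel described above can be pictured as a chain of filters followed by a ranking step. The sketch below is a minimal Python outline under stated assumptions: the field names (`affinity`, `selectivity`, `admet`, `rescored_dock`), the cutoffs and the toy values are all hypothetical and do not represent the platform's actual models, endpoints or thresholds.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    name: str
    affinity: float       # hypothetical predicted affinity score, higher is better
    selectivity: float    # hypothetical selectivity score in [0, 1]
    admet: dict = field(default_factory=dict)  # endpoint name -> predicted value
    rescored_dock: float = 0.0  # hypothetical AI-rescored docking score, higher is better


def primary_screen(cands, min_affinity=0.6, min_selectivity=0.5):
    """Keep compounds that pass both the affinity and selectivity cutoffs."""
    return [c for c in cands
            if c.affinity >= min_affinity and c.selectivity >= min_selectivity]


def secondary_screen(cands, max_allowed):
    """Keep compounds whose listed endpoints are all at or below their cutoffs.

    A compound with a missing endpoint value is rejected.
    """
    return [c for c in cands
            if all(c.admet.get(k, float("inf")) <= v for k, v in max_allowed.items())]


def prioritise(cands, top_n):
    """Rank the survivors by rescored docking score and keep the best top_n."""
    return sorted(cands, key=lambda c: c.rescored_dock, reverse=True)[:top_n]


# Toy library with made-up values.
library = [
    Candidate("cpd-1", 0.9, 0.8, {"herg_risk": 0.1, "logp": 3.2}, 7.5),
    Candidate("cpd-2", 0.4, 0.9, {"herg_risk": 0.1, "logp": 2.0}, 9.0),  # low affinity
    Candidate("cpd-3", 0.8, 0.7, {"herg_risk": 0.7, "logp": 2.5}, 8.1),  # fails hERG cutoff
    Candidate("cpd-4", 0.7, 0.6, {"herg_risk": 0.2, "logp": 4.1}, 8.8),
]
cutoffs = {"herg_risk": 0.3, "logp": 5.0}

hits = prioritise(secondary_screen(primary_screen(library), cutoffs), top_n=2)
hit_names = [c.name for c in hits]
print(hit_names)
```

Keeping each stage as a separate function mirrors the order in the text: cheap predictions remove most compounds before the more expensive docking step ranks the remainder.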
At this stage, we can also incorporate available data from high-throughput experiments, such as DNA-encoded library (DEL) screens.
A big advantage of the Receptor.AI approach to PROTAC and LYTAC design is the use of a well-tested AI platform that has been validated on multiple small-molecule drug design projects. The warhead and anchor moieties of PROTACs and LYTACs are typical small drug-like molecules joined by a linker. Thus, the small-molecule design workflow can be applied to each of them separately in the first stage.
After that, the experimental validation of warhead affinity to POI and the anchor affinity to the ligase is performed, and the results are used for the retraining and fine-tuning of AI models used for subsequent iterations.
The architecture of the Receptor.AI small molecule drug discovery platform is described in detail in our End-to-end pipeline Booklet.
Ligand design stage 1: Target exploration and preliminary screening of warheads and anchors.
In this stage, the Receptor.AI platform examines the POI and the ubiquitin ligase and determines the most appropriate binding pocket on each target (Fig. 3). The platform also performs preliminary virtual screening against a small stock chemical space to obtain initial information about the binding propensities of common chemical scaffolds against the targets of interest.
Pocket prediction
For proteins with no known binding pocket or with several possible binding pockets, the platform performs pocket prediction using one of three techniques:
- Pocket prediction by an AI model trained on all known pockets in the PDB. The model has an accuracy of ~75% and gives a good preliminary estimate of the pocket location for proteins that have reasonable homology with the protein-ligand structures in the PDB.
- Pocket prediction from literature data. If the functional residues are known from experiments, the pocket can be predicted in accordance with these data.
- Pocket prediction by the reverse diffusion model. In challenging cases, when the protein has no close homologs in the PDB and the experimental data are scarce or unreliable, we employ an AI model inspired by DiffDock, which performs a blind search for a plausible location of the binding pocket using the reverse diffusion approach.
The final decision about the pocket, which will be used in subsequent stages of the pipeline, is based on the results of preliminary virtual screening and experimental validation, as detailed below.
Virtual screening
At this stage, the virtual screening is performed with a minimal set of parameters. The chemical space is first pre-filtered with substructural filters to exclude compounds that are not suitable for coupling with the linker moiety. Two AI models, namely DTI and FB-DTI, are used for primary screening. While the DTI model is used only to assess the compound’s binding propensity and selectivity, the FB-DTI model plays an important role in excluding compounds that bind in inappropriate poses (with the linker coupling area directed inwards into the pocket).
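The pose criterion mentioned above (the linker coupling area must not point into the pocket) can be illustrated with a simple geometric check. This is only a hedged sketch of the idea and not the FB-DTI model: it assumes that the direction from the protein centroid to the ligand centroid approximates "outward", and all function names, the coordinate format and the cosine threshold are hypothetical.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def exit_vector_cosine(protein_atoms, ligand_atoms, attachment_atom):
    """Cosine of the angle between the 'outward' direction and the exit vector.

    outward  : protein centroid -> ligand centroid (assumed proxy for solvent side)
    exit_vec : ligand centroid  -> linker attachment atom
    """
    pc = centroid(protein_atoms)
    lc = centroid(ligand_atoms)
    outward = [l - p for l, p in zip(lc, pc)]
    exit_vec = [a - l for a, l in zip(attachment_atom, lc)]
    norm = math.hypot(*outward) * math.hypot(*exit_vec)
    if norm == 0.0:
        return 0.0  # degenerate geometry, treat as undecided
    return sum(o * e for o, e in zip(outward, exit_vec)) / norm


def pose_is_linkable(protein_atoms, ligand_atoms, attachment_atom, min_cosine=0.0):
    """True if the attachment atom faces away from the protein interior."""
    return exit_vector_cosine(protein_atoms, ligand_atoms, attachment_atom) > min_cosine


# Toy geometry: protein centred at the origin, ligand sitting on the +x side.
protein = [(-1, 0, 0), (1, 0, 0), (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1)]
ligand = [(9, 0, 0), (10, 0, 0), (11, 0, 0)]

good_pose = pose_is_linkable(protein, ligand, attachment_atom=(11, 0, 0))
bad_pose = pose_is_linkable(protein, ligand, attachment_atom=(9, 0, 0))
print(good_pose, bad_pose)
```

A centroid-based direction is a crude stand-in for real solvent exposure; for deep or irregular pockets a proper solvent-accessibility calculation would be needed.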
Molecular docking with AI rescoring is used for secondary screening. No filtering by ADME-Tox, physicochemical parameters or drug-likeness is performed at this point. However, all these parameters are assessed and reported to the user.
For proteins with several binding pockets, screening is performed for each pocket separately. Each compound is then assigned to the most plausible pocket using the docking results.
Integration of experimental feedback
Receptor.AI advises using the fastest and simplest experimental validation at this stage, such as rapid affinity assays. The goal of validation is not to find the perfect binders but to detect non-binders and to teach the models to avoid them in the following stages. In addition, the most promising binding pocket is determined by examining where the experimentally confirmed binders are predicted to bind according to docking.
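The pocket selection logic described here can be sketched in a few lines: each compound is assigned to the pocket where it docks best, and the pockets are then ranked by how many experimentally confirmed binders they attract. The code below is a minimal sketch with hypothetical compound names, pocket labels and scores; it assumes docking scores where lower (more negative) means better.

```python
from collections import Counter


def assign_pocket(scores_by_pocket):
    """Return the pocket with the best (lowest) docking score for one compound."""
    return min(scores_by_pocket, key=scores_by_pocket.get)


def most_supported_pocket(docking, confirmed_binders):
    """Count, per pocket, how many confirmed binders are predicted to bind there."""
    votes = Counter(assign_pocket(docking[c]) for c in confirmed_binders if c in docking)
    best = votes.most_common(1)[0][0] if votes else None
    return best, votes


# Toy docking results: compound -> {pocket: score}.
docking = {
    "cpd-1": {"pocket_A": -9.1, "pocket_B": -6.0},
    "cpd-2": {"pocket_A": -5.5, "pocket_B": -8.2},
    "cpd-3": {"pocket_A": -8.7, "pocket_B": -7.9},
    "cpd-4": {"pocket_A": -4.0, "pocket_B": -4.5},  # not confirmed experimentally
}
confirmed = ["cpd-1", "cpd-2", "cpd-3"]

best_pocket, votes = most_supported_pocket(docking, confirmed)
print(best_pocket, dict(votes))
```

Only confirmed binders vote, so compounds that dock well but fail the assay do not bias the choice of pocket.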
The DTI model is re-trained based on experimental feedback to be used in the next stage.
Deliverables:
- The most promising druggable binding pocket for each target.
- Optimised target-specific AI models for virtual screening.
- Virtual preliminary hit candidates suitable for linker coupling (1000 for each target, 2000 in total).
- Cost of compounds from the stock library: ~$20 per compound.
- Experimentally validated binders from the first iteration.
Estimated timeline:
- up to 2 weeks for the computational stage;
- up to 3 weeks for the delivery of compounds;
- up to 4 weeks for biological validation.
Ligand design stage 2: Discovery of potent warheads and anchors.
At this stage, the set of future anchors and warheads is identified, and additional data used in later stages is generated (Fig. 4). This stage is repeated until sufficient affinities and safety profiles of the warhead and the anchor are achieved.
Chemical space
A diverse multi-billion chemical space composed of the databases of several major vendors is used (~30B compounds). It is possible to use both enumerated and combinatorial chemical spaces. Additionally, the pharmacophore search could be performed in these spaces prior to the virtual screening in order to select the compounds with desired pharmacophore signatures. Custom chemical spaces of any size could also be used.
The same pre-filtering and substructure search are performed to exclude compounds for which linker coupling is sterically impossible.
Virtual screening
The virtual screening is performed using the “level 1” stack of technologies:
- DTI and FB-DTI affinity prediction models.
- Level 1 selectivity prediction model (proteome-wide DTI rank over ~9.3K proteins).
- Standard ADME-Tox assessment and filtering (40 endpoints). The user can choose any set of ADME-Tox parameters and set the filtering cutoffs for them prior to the screening.
- Molecular docking with AI rescoring.
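To make the idea of a proteome-wide rank concrete, the sketch below computes where the intended target falls when one compound's predicted interaction scores are sorted across a protein panel. It is a hedged illustration only: the score scale, the protein labels and the `max_rank_fraction` cutoff are assumptions, and a real panel would contain thousands of proteins, not four.

```python
def target_rank(scores, target):
    """1-based rank of the target among all proteins (higher score = stronger)."""
    target_score = scores[target]
    stronger = sum(1 for protein, s in scores.items()
                   if protein != target and s > target_score)
    return 1 + stronger


def passes_selectivity(scores, target, max_rank_fraction=0.01):
    """Accept the compound if the target ranks within the top fraction of the panel."""
    allowed = max(1, int(len(scores) * max_rank_fraction))
    return target_rank(scores, target) <= allowed


# Toy panel of predicted interaction scores for one compound.
panel = {"POI": 0.92, "off_1": 0.35, "off_2": 0.95, "off_3": 0.10}

rank = target_rank(panel, "POI")
strict = passes_selectivity(panel, "POI", max_rank_fraction=0.01)
lenient = passes_selectivity(panel, "POI", max_rank_fraction=0.5)
print(rank, strict, lenient)
```

A rank-based criterion does not depend on the absolute score scale, which makes it easier to compare compounds whose predicted scores are not calibrated against each other.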
Integration of experimental feedback
At this stage, Receptor.AI advises using more sensitive and elaborate affinity assays, which yield reliable data on at least ~100 potential warheads and anchors. The goal of validation is to prioritise high-affinity compounds and to use the knowledge about their molecular scaffolds to train target-focused AI models for the following stages.
Deliverables:
- In silico-derived anchor and warhead candidates suitable for linker coupling (1000 for each protein target).
- Synthesisable compounds: ~$300 per compound.
- Experimentally validated potent ligands from the 1st iteration.
- Optimised target-specific AI model for the lead discovery stage.
Estimated timeline:
- up to 2 weeks for the computational stage;
- up to 8 weeks for the synthesis and delivery of compounds;
- up to 4 weeks for biological validation.
The linker design and PROTAC assembly
In parallel, the linker design path is followed. The system can either utilise known linkers or design the linker de novo using generative graph models with rigidity and length constraints. Prospective linkers are assessed with the ADME-Tox and drug-likeness filters to ensure their safety.
De novo linker design
Custom linker design can also be performed a posteriori, in the later stages, after experimental data have been collected from assembled PROTACs. A classification model can be trained on the most successful PROTAC anchor-linker-warhead combinations. Data on existing linkers with custom-tuned rigidity can also be used. The resulting classifier then evaluates the linkers proposed by the generative graph model and identifies the most promising ones.
PROTAC assembly and whole-molecule screening
The best warheads, anchors and linkers are then subjected to combinatorial in silico synthon-based assembly into whole PROTAC molecules. Up to 10 of the most promising novel assembled compounds can be further assessed by the free energy perturbation (FEP) method. Selected candidate molecules are transferred for experimental validation using functional assays, and the results are fed back to the platform for AI model retraining and tuning (if needed).
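The combinatorial step can be pictured as enumerating every warhead-linker-anchor triple and keeping a short list for the expensive FEP calculations. The following is a minimal sketch under stated assumptions: the component names and scores are made up, and the product of component scores is only a placeholder for whatever whole-molecule scoring is actually applied.

```python
import heapq
from itertools import product


def assemble(warheads, linkers, anchors):
    """Enumerate every warhead-linker-anchor combination."""
    return list(product(warheads, linkers, anchors))


def shortlist_for_fep(combos, score_fn, limit=10):
    """Keep the `limit` best-scoring assemblies for further assessment."""
    return heapq.nlargest(limit, combos, key=score_fn)


# Toy component scores (hypothetical, higher is better).
warheads = {"W1": 0.9, "W2": 0.6}
linkers = {"L1": 0.8, "L2": 0.5, "L3": 0.7}
anchors = {"A1": 0.9, "A2": 0.4}


def placeholder_score(combo):
    """Placeholder whole-molecule score: product of the component scores."""
    w, l, a = combo
    return warheads[w] * linkers[l] * anchors[a]


combos = assemble(warheads, linkers, anchors)
fep_queue = shortlist_for_fep(combos, placeholder_score, limit=10)
print(len(combos), fep_queue[0])
```

The number of assemblies grows as the product of the three component counts, which is why a hard cap on the number of FEP candidates is useful.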
We propose an iterative validation process, which increases the success rate by decreasing the number of tested variables on each iteration (Fig. 5).
First, the molecules with novel warheads but with existing anchors and linkers are assembled, tested in silico and forwarded to experimental validation.
The following metrics are tested:
- DC50, the concentration at half-maximum degradation;
- Dmax, maximum degradation;
- SH2O, aqueous solubility;
- Papp, cell permeability;
- Foral, oral bioavailability.
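As an illustration of the first two metrics, the sketch below estimates Dmax and DC50 from a measured dose-response series. It is a simplified, assumption-laden example: Dmax is taken as the largest observed degradation, DC50 is found by log-linear interpolation at the first point where degradation reaches half of Dmax, and the concentrations and percentages are invented. In practice a fitted dose-response model would normally be used.

```python
import math


def dmax_and_dc50(concentrations_nm, degradation_pct):
    """Estimate (Dmax, DC50) from paired concentration and degradation values.

    Dmax : largest observed degradation (%).
    DC50 : concentration at which degradation first reaches Dmax / 2,
           interpolated linearly on a log10 concentration axis.
    """
    pairs = sorted(zip(concentrations_nm, degradation_pct))
    dmax = max(d for _, d in pairs)
    if dmax <= 0:
        return dmax, None  # no degradation observed
    half = dmax / 2.0
    prev_c, prev_d = pairs[0]
    if prev_d >= half:
        return dmax, prev_c  # already above half-maximum at the lowest dose
    for c, d in pairs[1:]:
        if d >= half:
            frac = (half - prev_d) / (d - prev_d)
            log_c = math.log10(prev_c) + frac * (math.log10(c) - math.log10(prev_c))
            return dmax, 10 ** log_c
        prev_c, prev_d = c, d
    return dmax, None


# Toy dose-response data: concentration in nM, degradation in percent.
conc = [1, 10, 100, 1000]
deg = [5, 25, 65, 80]

dmax, dc50 = dmax_and_dc50(conc, deg)
print(dmax, round(dc50, 1))
```

Using the first crossing keeps the estimate on the rising part of the curve even if degradation drops again at the highest concentrations.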
Second, the novel anchor is added while the existing linker is used. The molecules are again assembled, tested in silico and subject to experimental validation.
Finally, the novel linker is added, and the totally new PROTAC or LYTAC assembly is tested.
The process may end at each of these stages if the molecule with sufficient activity and desirable ADME-Tox properties is found.
Such a procedure also allows existing PROTAC designs to be adapted to novel targets and ubiquitin ligases seamlessly and iteratively.
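The staged scheme above, in which one novel component is introduced per iteration and the process stops as soon as a molecule is good enough, can be sketched as a simple loop. Everything in this example is hypothetical: the `evaluate` stub stands in for in silico testing plus experimental validation, and the DC50 and Dmax acceptance thresholds are chosen only for illustration.

```python
STAGES = [
    {"warhead": "novel", "anchor": "existing", "linker": "existing"},
    {"warhead": "novel", "anchor": "novel", "linker": "existing"},
    {"warhead": "novel", "anchor": "novel", "linker": "novel"},
]


def run_staged_validation(stages, evaluate, is_sufficient):
    """Run the stages in order and stop at the first sufficient result."""
    history = []
    for number, design in enumerate(stages, start=1):
        result = evaluate(design)
        history.append((number, design, result))
        if is_sufficient(result):
            break
    return history


def evaluate(design):
    """Stub for assembly, in silico testing and assays (made-up numbers)."""
    novel_parts = sum(1 for origin in design.values() if origin == "novel")
    toy_results = {
        1: {"dc50_nm": 500.0, "dmax_pct": 60.0},
        2: {"dc50_nm": 80.0, "dmax_pct": 85.0},
        3: {"dc50_nm": 40.0, "dmax_pct": 90.0},
    }
    return toy_results[novel_parts]


def is_sufficient(result, max_dc50_nm=100.0, min_dmax_pct=80.0):
    """Hypothetical acceptance rule on degradation potency and efficacy."""
    return result["dc50_nm"] <= max_dc50_nm and result["dmax_pct"] >= min_dmax_pct


history = run_staged_validation(STAGES, evaluate, is_sufficient)
stopped_at = history[-1][0]
print(stopped_at)
```

Changing one component per stage means that a drop or gain in activity can be attributed to that component, which is the point of reducing the number of tested variables per iteration.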
Membrane permeability prediction
In addition to this standard procedure, which roughly follows the workflow for small molecules, special attention has to be paid to the membrane permeability of the designed PROTACs. Being large and bulky molecules, they do not easily cross the membrane unless designed to do so from the ground up. We leverage the proprietary technology for designing membrane-targeting drugs, which is being developed at Receptor.AI, to ensure that the designed chimaeras are amphiphilic molecules with a high propensity for membrane crossing (Fig. 5).
For LYTACs, the membrane permeability is less important because most of their target receptors are exposed on the outer membrane surface, and the molecules are internalised by endocytosis without crossing the membrane compartment boundary.
Deliverables:
- In silico PROTAC candidates (100).
- The experimentally validated novel hit compounds from the 3rd iteration.
- Optimised target-specific AI models for the lead discovery stage.
Estimated timeline:
- up to 2 weeks for the computational stage;
- up to 18 weeks for the synthesis and delivery of compounds;
- up to 3 months for biological validation.
Finally, if obtained PROTACs and LYTACs pass all the quality control criteria, they are designated as hit compounds.
Conclusion
PROTACs and LYTACs are promising categories of drugs capable of overcoming the major limitations of conventional small-molecule drugs. However, a relatively small amount of data regarding their activity and SAR has been gathered to date due to the novelty of such bifunctional degraders as a therapeutic modality.
That is why the usual drug discovery workflow, which is based on the exploration of large chemical spaces in conjunction with generalised proteome-wide activity prediction models, is currently relatively ineffective for the de novo design of PROTACs and LYTACs. In contrast, the iterative scheme, which designs each of the functional and linker moieties separately and integrates all available experimental feedback at each design stage, compensates for the lack of data.
It is also necessary to note that the classical physics-based approaches to assessing affinity are barely applicable to degraders in most cases due to the size and complexity of the ternary degrader-protein-enzyme complexes. This makes an AI-based design with iterative experimental validation even more attractive in terms of cost and development time.
That is why we believe that the Receptor.AI platform provides a much-needed tool for the rational design of PROTACs and LYTACs, reinforced by continuous experimental feedback and capable of delivering novel bifunctional degraders in a reliable and cost-efficient manner.