Analysing the spike protein of SARS-CoV-2 with 3DM

Bio-Prodict has released a suite of 3DM systems to support COVID-19 drug and vaccine R&D; you can read up on that here.

In this post we will demonstrate the capabilities of 3DM, using the SARS-CoV-2 spike protein as an example. You’ll see how 3DM systems can make protein research faster and more effective, and how they can help you generate hypotheses.

3DM systems are protein super-family platforms that collect, combine and integrate many different types of protein-related data. They can facilitate protein engineering, drug design, vaccine research and any other protein R&D that requires in-depth knowledge about the molecular mechanisms of a certain protein family. You can find more information on 3DM here.

The spike protein is a large protein located at the surface of coronaviruses and constitutes the infamous “crown” of the virus. Because the spike protein is exposed, it could be easily accessible to human antibodies, and that makes it a good candidate for vaccine research. It is also part of the coronavirus’s attack mechanism, as it binds to the human ACE2 receptor, which then mediates entry of the virus into human cells.

Let’s have a look at the SARS-CoV-2 spike protein in the context of the whole protein family. Can we find some interesting differences between SARS-CoV-2, SARS-CoV, and other family members?

Protein structure

PDB ID 6VSB, Wrapp D. et al. Science 2020

The spike protein is a homotrimer, each chain consisting of approximately 1200 residues. The human ACE2 protein is bound by the top part of the spike. Unfortunately, for now, there are no X-ray structures available for the complex of the SARS-CoV-2 spike protein and human ACE2. However, there are structures of spike proteins from other virus species (e.g. SARS-CoV) bound to ACE2. We can use 3DM to transfer information about ACE2 binding from the SARS-CoV proteins onto the SARS-CoV-2 structure and investigate the potential differences and similarities in ACE2 binding between the two proteins.

In our spike glycoprotein system we created a hotspot basket of positions that are in contact with the ACE2 receptor in structure 6ACJ. This hotspot basket is accessible to anyone using the system. Using 3DM we can view these positions on our SARS-CoV-2 structure (6VSB). This is the receptor binding domain (RBD) of the two structures superposed on top of each other:

PDB ID 6VSB, Wrapp D. et al. Science 2020; PDB ID 6ACJ Song W. et al. PLoS Pathog. 2018

Blue is the SARS-CoV-2 structure and green is the SARS-CoV structure. The SARS-CoV-2 structure is missing some residues (the chain breaks are indicated with pink dots), but we can infer information about these from the other structure. In general, the binding interface looks very similar in the two proteins: almost all residues that contact the ACE2 protein are conserved (with the exception of position 498, where SARS-CoV has a tyrosine and SARS-CoV-2 a glutamine residue). An interesting thing that we see here is that the conserved residue at position 488 forms a cysteine bridge with a cysteine (position 480) from the loop that’s missing in the SARS-CoV-2 structure. If we now look at the alignment, we see that in the closest proteins, whenever there’s no cysteine at position 488, the whole loop is also missing. So the loop is present both in SARS-CoV-2 (though it is not resolved in the structure, we only see it in the sequence alignment) and in SARS-CoV, but not, for example, in Bat SARS-like coronavirus proteins, even the ones that are very close to SARS-CoV-2 in terms of sequence identity.
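To make the observation above concrete, here is a minimal Python sketch of how one could check, in any multiple sequence alignment, whether a cysteine at one column goes together with the presence of a loop. It is not how 3DM does this internally: the sequences, names and column indices below are made up for illustration, and a real analysis would use the alignment exported from the system.

```python
from collections import Counter

# Toy alignment (hypothetical sequences, NOT real spike protein data).
# Columns 2-6 stand in for the loop, column 8 for the partner cysteine.
toy_alignment = {
    "toy_A": "GFCNGVEGCY",
    "toy_B": "GFCTPPALCY",
    "toy_C": "GF-----GSY",
    "toy_D": "GY-----GAY",
}

def loop_vs_cysteine(alignment, cys_col, loop_cols):
    """Count sequences by (has cysteine at cys_col, has any loop residue)."""
    counts = Counter()
    for sequence in alignment.values():
        has_cys = sequence[cys_col] == "C"
        # The loop counts as present if any loop column is not a gap.
        has_loop = any(sequence[col] != "-" for col in loop_cols)
        counts[(has_cys, has_loop)] += 1
    return counts

counts = loop_vs_cysteine(toy_alignment, cys_col=8, loop_cols=range(2, 7))
for (has_cys, has_loop), n in sorted(counts.items()):
    print(f"cysteine={has_cys}, loop={has_loop}: {n} sequence(s)")
```

In this toy data every sequence falls into one of two groups (cysteine and loop both present, or both absent), which is the same kind of pattern as the one described above for the real alignment.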

These cysteines are definitely important for stabilising this loop. Could that also be a factor in binding affinity to the ACE2 receptor? We don’t know, but it’s one of the hypotheses that you can raise by using 3DM and then have tested experimentally in a lab. Moreover, further into our investigation we do find an article that confirms the loop’s importance for ACE2 binding in SARS-CoV:

The next step of this analysis would be to compare sequences with and without the loop (and with and without the cysteine at position 488) and check whether there’s a correlation with evolutionary pressures on different positions. We’re not going to do that right now, as this is just a short demo to showcase the tools in 3DM, but it’s something that you could follow up on after getting access to the 3DM system.

Sequence projection and literature search

Let’s see if we can find out more about the spike protein’s mechanism of action with another tool in 3DM: sequence projection. We’re going to visualize mutations that we mined from the literature and were able to map onto other proteins in the alignment. Position 493 is one of the residues most often described in the literature, which is a good hint that it is an important residue. We can view the abstracts and see if there’s anything of interest for us.

One of the articles, titled “Identification of two critical amino acid residues of the severe acute respiratory syndrome coronavirus spike protein for its variation in zoonotic tropism transition via a double substitution strategy.” mentions in the abstract that:

The residue described in the article is referred to as position 479. That’s because the SARS-CoV sequence is slightly different, and the equivalent of the SARS-CoV-2 Q493 is N479. Similarly, position 487 of the SARS-CoV protein is equivalent to position 501 in SARS-CoV-2. Luckily, you don’t have to spend time figuring out which positions in SARS-CoV correspond to which positions in SARS-CoV-2: 3DM does that for you!
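For readers curious about what such a position mapping involves, here is a minimal Python sketch that derives it from two aligned (gapped) sequences. This only illustrates the general idea and is not 3DM’s implementation; the function name `build_position_map` and the toy sequences are made up for this example, and real spike sequences would of course be much longer.

```python
def build_position_map(aligned_a, aligned_b, start_a=1, start_b=1):
    """Map residue numbers of sequence A to residue numbers of sequence B.

    Both inputs are rows of the same alignment, with '-' marking gaps.
    Returns {number_in_a: (number_in_b, residue_a, residue_b)}.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    num_a, num_b = start_a - 1, start_b - 1
    for res_a, res_b in zip(aligned_a, aligned_b):
        # Advance each counter only when that sequence has a residue here.
        if res_a != "-":
            num_a += 1
        if res_b != "-":
            num_b += 1
        # Only columns where both sequences have a residue are equivalent.
        if res_a != "-" and res_b != "-":
            mapping[num_a] = (num_b, res_a, res_b)
    return mapping

# Toy aligned sequences (hypothetical, NOT real spike protein data).
toy_a = "AC--DEFNGH"
toy_b = "ACWWDEFQGH"

pos_map = build_position_map(toy_a, toy_b)
num_b, res_a, res_b = pos_map[6]
print(f"{res_a}6 in sequence A corresponds to {res_b}{num_b} in sequence B")
```

Because sequence B has a two-residue insertion, every position after it is shifted by two; the same principle explains why N479 in SARS-CoV and Q493 in SARS-CoV-2 end up at the same alignment position despite having different residue numbers.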

Coming back to that abstract excerpt: it gives us important information. Since this residue is a glutamine in the SARS-CoV-2 protein and an asparagine in SARS-CoV (so they can make similar interactions), it is likely that such a substitution would have the same effect as in the SARS-CoV protein. The next step from here would be to investigate the second residue described in the article (T487, which is position 501 in our alignment). It turns out that in the SARS-CoV-2 protein this residue is not a threonine but an asparagine. That might mean that the virus has found a way around this deleterious substitution. Maybe different residues have taken over the role of residues 479 and 487 from SARS-CoV?

Maybe you can figure it out with the use of 3DM and further wet-lab experiments! We’ve shown you only a small subset of 3DM’s capabilities — how it can help you generate hypotheses and narrow down possibilities that can later be confirmed in the lab.

If you’re interested in learning more please contact us or have a look at the 3DM walkthrough.

Bioinformatics solutions for protein engineering, drug design, and DNA diagnostics #3DM #helixlabsai
