Published in Analytics Vidhya

PyMOL for Visualization of the BRCA2 Complex and 1N0W, 1PZN, 1MJE, 1MIU Structures

PyMOL rendering of the 1IYJ BRCA2-DSS1 complex. (Image Source)

This article covers how I used the visualization tool PyMOL to examine 3D representations of the structures named above and to look in detail at various protein-protein and protein-DNA interactions.


  • BRCA2: the Breast Cancer Type 2 susceptibility protein. It is crucial to DNA damage repair and response pathways. Mutations in the BRCA2 gene have been linked to several forms of cancer.
  • RAD51: a member of the RAD51 protein family that assists in the repair of DNA double-strand breaks
  • 1N0W: a crystal structure of a RAD51-BRCA2 BRC repeat complex, associated with gene regulation and tumor suppression
  • 1PZN: a RAD51 structure associated with recombination and genetic variation
  • 1MJE: a structure of a BRCA2-DSS1-ssDNA complex, associated with gene regulation and tumor suppression
  • 1MIU: a structure of a BRCA2-DSS1 complex, associated with gene regulation and tumor suppression
  • PyMOL: an open-source molecular visualization system
  • PDB: the Protein Data Bank, an open-access database of biological macromolecular structures for research
  • NCBI VAST: a search service that compares a newly resolved 3D structure against structures already in the PDB
  • Protein domains: regions of a protein that are self-stabilizing and fold independently. A protein can have several domains, and each domain often carries its own structural or functional role.


The NCBI VAST search can be used to analyze these structures in greater detail. It lets me compare their tertiary structures with those of other PDB entries and then look at related sequences to infer what a structure does.

Screenshot of taxonomic and general data about the structure for 1N0W. (Image Source)
Screenshot of similar tertiary protein structures and their relevant information for 1N0W. (Image Source)

For instance, looking at the first similar structure, 6HQU, can reveal information about 1N0W’s function if it is not already known. By clicking its link, I could see that 6HQU is likewise related to BRCA2 and to DNA repair processes. This was useful for initial research about the protein and its various functions.

PyMOL Visualization


1MJE’s structure in PyMOL.

Note that the structure includes single-stranded DNA (ssDNA). The ssDNA is part of the crystallized complex (1MJE is a BRCA2-DSS1-ssDNA structure), not a sign of a defect in the protein. By rotating the object, you can see how the ssDNA runs through a groove of BRCA2 and interacts with it.

The sequence window for 1MJE.

Another part of PyMOL is its sequence window, which shows the amino acid sequence of the protein. Several features stand out: the six oligo(dT) nucleotides, the individual chains, which can be colored with C > by chain > by chain, and individual residues, which can be selected in the window and highlighted with C > reds > red.
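The same menu actions can also be scripted through PyMOL's Python API. The block below is a minimal sketch rather than the exact steps behind the screenshots: `color_each_chain` is a hypothetical helper name, the palette is an arbitrary choice, and `cmd` is assumed to be the object you get from `from pymol import cmd`.

```python
from itertools import cycle

# Arbitrary palette of built-in PyMOL color names (an assumption, not from the article).
PALETTE = ["green", "cyan", "magenta", "yellow", "salmon", "orange"]


def color_each_chain(cmd, obj="1MJE", palette=PALETTE):
    """Turn on the sequence viewer and give every chain of `obj` its own color.

    `cmd` is PyMOL's command module. Returns a dict mapping chain ID to the
    color name that was applied, so the assignment can be inspected.
    """
    cmd.set("seq_view", 1)  # same effect as Display > Sequence in the GUI
    assignment = {}
    for chain, color in zip(cmd.get_chains(obj), cycle(palette)):
        cmd.color(color, f"model {obj} and chain {chain}")
        assignment[chain] = color
    return assignment


# Usage inside a PyMOL session (not executed here):
#   from pymol import cmd
#   cmd.fetch("1MJE")
#   print(color_each_chain(cmd))
```

Passing `cmd` in as an argument keeps the helper easy to try out with a stand-in object before running it against a real session.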

1MJE split into chains and different proteins.

Focusing on the chains, we can see that the DSS1 protein is deeply embedded in the BRCA2 structure. PyMOL lets us visualize how DSS1 interacts with BRCA2, but it doesn’t show any products that result from the interaction.

Lastly, note that part of the DSS1 chain is missing from the structure: certain amino acids are not modeled, which is a point of study for a future article. Portions of the BRCA2 chain are missing as well. Gaps like these in a crystal structure usually mean the residues could not be resolved in the experiment (for example, because they are disordered), rather than that they are absent from the protein.
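One way to locate missing stretches without scrolling through the sequence window is to look for jumps in residue numbering in the downloaded PDB file. This is a rough sketch under the assumption that the file follows the standard fixed-column PDB layout; the function names are my own, and a numbering jump is only a hint, since some entries skip numbers for other reasons.

```python
def residue_numbers_by_chain(pdb_lines):
    """Collect the residue numbers seen in ATOM records, grouped by chain ID."""
    seen = {}
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        chain_id = line[21]         # column 22: chain identifier
        res_seq = int(line[22:26])  # columns 23-26: residue sequence number
        seen.setdefault(chain_id, set()).add(res_seq)
    return seen


def find_gaps(residue_numbers):
    """Return (last_before_gap, first_after_gap) pairs for one chain."""
    ordered = sorted(residue_numbers)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > 1]


# Usage with a locally saved file (the file name is an assumption):
#   with open("1mje.pdb") as handle:
#       for chain, numbers in residue_numbers_by_chain(handle).items():
#           print(chain, find_gaps(numbers))
```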


1MIU’s structure in PyMOL.

Next, note the 1MIU structure, which similarly contains a DSS1 protein interacting with BRCA2.

The protein domains for BRCA2.

The tower is easy to recognize in the PyMOL visualization, and so are the other domains once each one has been colored. From this view, the tower’s function appears to be enabling proper binding of BRCA2 to DNA.
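Coloring domains comes down to coloring residue ranges. The sketch below shows the idea with PyMOL's `cmd.color` and the `resi first-last` selection syntax. The helper name and the ranges in `EXAMPLE_REGIONS` are placeholders for illustration only; they are not the real BRCA2 domain boundaries, which you would need to look up for the structure you load.

```python
def color_regions(cmd, obj, chain, regions):
    """Color residue ranges of one chain.

    `regions` maps a label to (first_residue, last_residue, color_name).
    Returns the selection string that was colored for each label.
    """
    selections = {}
    for label, (first, last, color) in regions.items():
        selection = f"model {obj} and chain {chain} and resi {first}-{last}"
        cmd.color(color, selection)
        selections[label] = selection
    return selections


# PLACEHOLDER ranges for illustration only; NOT the real BRCA2 domain boundaries.
EXAMPLE_REGIONS = {
    "region_1": (1, 50, "yellow"),
    "region_2": (51, 120, "orange"),
}

# Usage inside a PyMOL session (not executed here; chain "A" is also an assumption):
#   from pymol import cmd
#   cmd.fetch("1MIU")
#   color_regions(cmd, "1MIU", "A", EXAMPLE_REGIONS)
```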


1N0W’s structure in PyMOL.

After opening 1N0W and splitting it into chains in PyMOL, we can see that the longest chain takes part in the interaction between a RAD51 subunit and a BRC repeat. By zooming into the BRC repeat, we can analyze the interaction region.

By using C > by element > CHNOS, we can see two red oxygens in the carboxyl group of a glutamate (E) in the BRC repeat chain, located at position 1548 of the chain.
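The glutamate can also be picked out with a selection expression instead of the menus. This is a minimal sketch, assuming the structure is loaded as an object named `1N0W`; the helper name is hypothetical. `OE1` and `OE2` are the standard PDB atom names for the two side-chain carboxyl oxygens of glutamate.

```python
def highlight_carboxyl_oxygens(cmd, obj="1N0W", resi=1548, resn="GLU"):
    """Show one residue as sticks and color its side-chain carboxyl oxygens red.

    `cmd` is PyMOL's command module. Returns the number of atoms that matched
    the carboxyl-oxygen selection.
    """
    residue = f"model {obj} and resn {resn} and resi {resi}"
    carboxyl = f"({residue}) and name OE1+OE2"
    cmd.show("sticks", residue)
    cmd.color("red", carboxyl)
    cmd.zoom(residue)  # center the view on the residue
    return cmd.count_atoms(carboxyl)


# Usage inside a PyMOL session (not executed here):
#   from pymol import cmd
#   cmd.fetch("1N0W")
#   highlight_carboxyl_oxygens(cmd)
```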

With PyMOL, it is also possible to measure the distance between atoms. For instance, after selecting Wizard > Measurement, you can click two atoms to get the distance between them in angstroms, such as the two atoms of a hydrogen bond or the two red oxygens mentioned above.

An example measurement between two parts of the 1N0W structure.

The measurement yielded a hydrogen bond distance of 2.3 angstroms, which is within the typical hydrogen bond range of 2.2 to 2.5 angstroms. Measuring the distance between the oxygen and the nitrogen yielded 2.7 angstroms, which is also within the typical range of 2.5 to 3.5 angstroms.
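The number the measurement wizard reports is the Euclidean distance between the two atoms' coordinates. As a rough illustration, the same value can be computed from two ATOM lines of the PDB file. The function names are my own, and the sketch assumes standard fixed-column PDB records with coordinates in angstroms.

```python
import math


def parse_xyz(atom_line):
    """Read x, y, z (in angstroms) from a PDB ATOM or HETATM line."""
    return (
        float(atom_line[30:38]),  # columns 31-38: x
        float(atom_line[38:46]),  # columns 39-46: y
        float(atom_line[46:54]),  # columns 47-54: z
    )


def atom_distance(atom_line_a, atom_line_b):
    """Euclidean distance in angstroms between two atoms given as PDB lines."""
    return math.dist(parse_xyz(atom_line_a), parse_xyz(atom_line_b))


def in_range(value, low, high):
    """True if a measured distance falls inside an inclusive range."""
    return low <= value <= high
```

Inside a PyMOL session, `cmd.get_distance(selection1, selection2)` should give the same kind of number for two single-atom selections; `in_range` can then check it against whichever range you consider typical, such as the 2.5 to 3.5 angstroms quoted above.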


1PZN’s structure in PyMOL.

A similar analysis of the BRC interaction can be done on the 1PZN model. First, align the RAD51 and BRC interaction elements, then orient the view so that all three are visible at once. From here, we can see that there are multiple interaction regions, which suggests redundancy that leaves less room for genetic variation.
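The align-then-orient step can be scripted with `cmd.align` and `cmd.orient`. The block below is a sketch under assumptions: the helper name is hypothetical, and which objects or selections to pass as `mobile` and `target` is your choice; the article does not record the exact selections used.

```python
def superpose_and_orient(cmd, mobile, target):
    """Superpose `mobile` onto `target`, then orient the camera on both.

    `cmd` is PyMOL's command module. cmd.align returns a sequence whose first
    two items are the RMSD after refinement and the number of aligned atoms.
    """
    result = cmd.align(mobile, target)
    cmd.orient(f"({mobile}) or ({target})")
    return {"rmsd": result[0], "aligned_atoms": result[1]}


# Usage inside a PyMOL session (not executed here; aligning the whole
# objects is an assumption, a narrower selection may suit better):
#   from pymol import cmd
#   cmd.fetch("1N0W")
#   cmd.fetch("1PZN")
#   print(superpose_and_orient(cmd, "1PZN", "1N0W"))
```

A large RMSD or a small number of aligned atoms is a sign that the two selections do not superpose well and that a narrower selection may be needed.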


Through this experiment, I learned a lot about the Protein Data Bank and how to visualize structures using PyMOL. I chose these four structures arbitrarily and was just experimenting with PyMOL, but I hope to do more analysis of the BRCA2 gene and use my past knowledge of bioinformatics to analyze the genetic sequence as well.


  • This article goes over how to use the Protein Data Bank and how to analyze several protein interactions involving BRCA2.
  • The Protein Data Bank is useful for finding proteins and for looking up a particular protein’s structure.
  • PyMOL was used heavily for visualization. It allows for viewing a protein structure, aligning different structures, highlighting residues and substructures, and measuring inter-atom distances.





Aditya Mittal
