New Quantum Way of Mapping Whole Genome Sequence

Shubhojit Roy
Published in Unknownimous
7 min read · Sep 27, 2020

“History repeats itself, in part because the genome repeats itself. And the genome repeats itself, in part because history does.” Siddhartha Mukherjee

With the ability to sequence more than 60,000,000 DNA fragments simultaneously, the first generation of the Illumina Genome Analyzer revolutionized the ability of labs to generate large volumes of sequence data in a week at an extremely low cost, data that previously required the resources of a genome center. The diverse range of applications enabled by the Genome Analyzer's massively parallel sequencing technology, its simple workflow, and a 100-fold decrease in cost bring us significantly closer to understanding the links between genotype and phenotype and to establishing the molecular basis of many diseases. To support any claim about the molecular basis of a disease, however, the data must be complete and unambiguous, and next-generation sequencing (NGS) data is often accompanied by a cascade of gaps in the sequence, the result of GC repeats and several other complexities in our genome.

Short-read sequencing with NGS (reads of up to about 150 bp) cannot fully resolve the gaps that arise in the sequence. These gaps come from complex genes, which demand high-resolution genome mapping, and integrating NGS with physical mapping reinvigorates the efficiency of the latter. High-resolution mapping with super-resolution techniques such as PALM and STORM, however, relies on complicated, generic protocols that compromise the fundamental output needed to discover structural variation in a gene. The constant need for optical physical mapping of single molecules at coverage above the required threshold creates an opportunity for a new, quantum-level way of mapping a single molecule: using CRISPR-Cas9 as a nanoparticle marker and DVD optics to produce the DNA mapping images.

The advantage of physical DNA mapping is that it identifies structural variants in genes and helps align haplotype genomes. Physical mapping is used far less than NGS in human genomics because of its practical constraints: low coverage and low accuracy. Overcoming them calls for a leap to mapping with quantum technology at super-high resolution.

Nanopore sequencing can, in theory, read every base and identify long structural variants. In practice its dependence on physical constraints limits scalability, and reads are restricted to no more than about 10 kb, which is long by NGS standards but short compared with physical mapping. Long reads avoid frequent fragmentation of the target DNA and bridge the gaps caused by repetitive elements or sequence complexity. In nanopore sequencing the sensor is fixed while the molecule moves, which creates an additional difficulty: DNA is a long molecule that can curl up in solution. In optical mapping (e.g., optical mapping, FISH, DNA combing) the long, extended molecule is fixed while the sensor moves over it, and molecules of approximately 300 kb can be read. Nanopore sequencing also uses a long primer, so what is read is a synthesized template rather than the native molecule. Optical mapping, for its part, needs coverage greater than 100X and a marker accuracy of >250 bp for computational alignment, and even then features are typically located with a mismatch of 2 to 3 kb. Within these limits, optical mapping supplements next-generation sequencing (Illumina, PacBio).

The atomic force microscope (AFM) is a distinctive tool for visualizing biomolecular processes, but it sacrifices scalability: its imaging window is only about 1 x 1 micrometre, which is a problem when trying to read a physical map of a genome. The new high-speed AFM (HS-AFM), in contrast, raises fidelity to an unprecedented level, extends coverage and scalability, and delivers high data rates while sidestepping bandwidth bottlenecks. Nick-based labeling uses the enzymatic activity of Cas9 to introduce a nick in the DNA, which is then filled in with fluorescently labeled dNTPs. That method is lengthy, nicking a short stretch of DNA is unstable, and in a long DNA molecule two nicks can fall too close to each other. In contrast, our methodology harnesses the labeling chemistry of Cas9, not its enzymatic properties, to build physical maps of DNA.

CRISPR Cas9 chemistry of labeling

CRISPR-Cas9 and the sgRNA are incubated with the target DNA in the presence of formaldehyde, which prevents Cas9 from dissociating from the sgRNA, so that Cas9 serves as a single-molecule label without any enzymatic activity. Cas9 forms a complex with the sgRNA and the target DNA; unbound Cas9 is removed from the sample with a wash buffer, and the remainder is deposited on a mica plate. Prior to imaging, the sample and the mica plate are heated to 120 degrees Celsius, a step that increases the efficiency of physical mapping. The HS-AFM records high-speed, high-resolution data over a 17x17mm area. The AFM tip scans in air across the 17x17mm area to record 1000x1000 pixel data, and the cantilever is tracked by a single compact laser vibrometer with a 100 million height adjust ratio over piezoelectric X, Y, and Z axis motors. Labeling is about 90% accurate, falling to about 50% with a single-base mismatch, at 0.5 to 5 nm per pixel.

source: Nature

To fix the DNA on the mica plate, magnesium ions are used to bridge the negatively charged DNA and the negatively charged mica surface, which is quite effective in this case. Fixation with Mg ions gives an irreversible attachment with high affinity for the sample, and the forces are stable enough for the DNA to adsorb efficiently so that its contour length can be determined.
The efficiency of Cas9 labeling is assessed with a series of genetic targets such as BRCA1, HER, and TERT, each specific to its sgRNA complex, along with the precision of labeling on a single molecule. Cas9 forms a sequence-specific complex with the sgRNA and binds at or near the target sequence with a precision of about 90%. The Cas9-sgRNA complex is visible in HS-AFM as a small peak approximately 3 nm high.
For example, an sgRNA is synthesized complementary to a subset of the Alu repeats in the BRCA1 sequence. Alu repeats are the most abundant sequence in BRCA1, a tumor suppressor gene, and any mutation or structural variant (SV) involving them can be responsible for loss of its checkpoint (tumor suppressor) function. In the process, no nicking of the DNA strand by wild-type Cas9 is observed. The ratio of Cas9 to sgRNA is important for sequence-specific binding at 90%; if the figure falls below this, the ratio of the two components can be adjusted to increase binding specificity. The concentration of sgRNA in the sample is also important in determining binding specificity.
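To make the idea of sequence-specific labeling concrete, here is a minimal sketch of how one might predict where a Cas9-sgRNA complex is expected to bind on a known sequence. It assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG PAM immediately 3' of the protospacer; the article does not say which Cas9 variant was used. The function names and the toy sequence are hypothetical and only for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_label_sites(dna, protospacer):
    """List expected Cas9 label sites as (strand, start) tuples.

    A site is an exact protospacer match followed by an NGG PAM
    (assumption: SpCas9). `start` is the 0-based forward-strand
    coordinate where the protospacer+PAM footprint begins.
    """
    dna = dna.upper()
    protospacer = protospacer.upper()
    n = len(protospacer)
    footprint = n + 3  # protospacer plus the 3-nt PAM
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for i in range(len(seq) - footprint + 1):
            has_target = seq[i:i + n] == protospacer
            has_pam = seq[i + n + 1:i + footprint] == "GG"  # N-G-G
            if has_target and has_pam:
                # Convert reverse-strand hits back to forward coordinates.
                start = i if strand == "+" else len(dna) - (i + footprint)
                sites.append((strand, start))
    return sites


# Toy example: one target with a PAM, one identical target without.
guide = "GACCTGAATCGTAGCTAGCA"
toy_dna = "TTTT" + guide + "TGG" + "AAAA" + guide + "TAA"
print(find_label_sites(toy_dna, guide))
```

Only the first copy of the target is reported, because the second copy lacks a PAM. Real binding also tolerates some mismatches, which this exact-match sketch deliberately ignores.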

https://www.nanowerk.com/nanotechnology-news/newsid=48758.php

The small globular peaks mark Cas9 labels on the Alu repeats of a 12,500 bp BRCA1 gene sequence.
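As a rough illustration of how such peaks could be picked out of the data, the sketch below scans a height profile traced along a DNA molecule and reports one position per run of samples that rises above a threshold. The 2.0 nm default threshold is an assumption chosen to sit below the roughly 3 nm Cas9 peak height mentioned above; the function name and the sample profile are hypothetical.

```python
def find_label_peaks(profile_nm, threshold_nm=2.0):
    """Return the index of the highest sample in each run above threshold.

    profile_nm   : heights (nm) sampled along the DNA backbone
    threshold_nm : assumed cutoff separating bare DNA from a Cas9 label
    """
    profile = list(profile_nm)
    peaks = []
    run_start = None
    # A sentinel below any real height closes a run that reaches the end.
    for i, height in enumerate(profile + [float("-inf")]):
        if height >= threshold_nm:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            run = profile[run_start:i]
            peaks.append(run_start + run.index(max(run)))
            run_start = None
    return peaks


# Hypothetical profile: bare DNA near 0.5 nm with two label-like bumps.
profile = [0.5, 0.6, 0.5, 2.4, 3.1, 2.6, 0.5, 0.4, 2.9, 3.0, 0.5]
print(find_label_peaks(profile))
```

A fixed threshold is the simplest possible choice; real AFM images would also need flattening and drift correction before a step like this.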

Spatial Resolution

Spatial resolution, i.e., the precision with which a label is located on the DNA molecule, is an important parameter for DNA mapping. The accuracy of HS-AFM is linear over the range of 100 to 4550 bp, but some sequence stretches are long and some are short. To identify short tandem repeats for genetic fingerprinting, the resolution should be high enough to discriminate differences of 2 to 3 bp, and the same capability could be used to map any structural variation in the sequence. HS-AFM can image labeled DNA at high speed and resolution, and the image quality is high enough to be rendered online. Cas9 is also used in biosensors to detect changes in DNA conformation (triplex, quadruplex) and DNA-associated protein molecules such as histones.
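To put the base-pair figures into physical units, the sketch below converts between contour length in nanometres and length in base pairs. It assumes the textbook B-form DNA rise of about 0.34 nm per base pair; DNA adsorbed on mica and imaged in air may deviate from this, so the constant should be treated as an assumption to calibrate. All function names are hypothetical.

```python
RISE_NM_PER_BP = 0.34  # assumption: B-form DNA; calibrate for real samples


def contour_nm_to_bp(length_nm, rise=RISE_NM_PER_BP):
    """Convert a contour length in nm to an approximate length in bp."""
    return length_nm / rise


def bp_to_contour_nm(length_bp, rise=RISE_NM_PER_BP):
    """Convert a length in bp to an approximate contour length in nm."""
    return length_bp * rise


def label_spacing_bp(pos_a_px, pos_b_px, nm_per_pixel, rise=RISE_NM_PER_BP):
    """Distance in bp between two labels measured along the backbone.

    Positions are in pixels along the traced molecule, and nm_per_pixel
    is the image calibration.
    """
    return contour_nm_to_bp(abs(pos_b_px - pos_a_px) * nm_per_pixel, rise)


# A 12,500 bp molecule is roughly 4.25 micrometres long under this assumption.
print(round(bp_to_contour_nm(12_500)))
# Telling apart a 2 to 3 bp difference means resolving about 0.7 to 1 nm.
print(round(bp_to_contour_nm(2), 2), round(bp_to_contour_nm(3), 2))
```

The second calculation shows why short-tandem-repeat discrimination is demanding: the length difference to be resolved is below a nanometre.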

Data Scalability

A sophisticated protocol like this demands large amounts of data storage and analysis, which hinders the scalability of the operation. The most distinctive and complex piece of hardware is the single compact laser Doppler vibrometer, which detects labeled DNA molecules with nanoscale precision, down to differences of 2 to 3 bp. The same can be achieved with an off-the-shelf DVD optical pickup unit (OPU). The OPU uses astigmatic focusing to detect the inclination of the cantilever relative to the sample. It is placed over the scanner and focuses its laser spot on the cantilever; as the cantilever moves with nanometre-scale precision, the shape of the spot on the photodiode detector changes. Nanometre-scale precision is obtained by running the OPU on batteries to suppress electrical noise. The OPU has onboard voice coil actuators to focus the detection beam and an integrated laser, and an amplifier bandwidth of >10 MHz makes it ideal for sensing rapid changes in displacement. The image quality is similar to that obtained with the laser Doppler vibrometer. With no modifications, this OPU HS-AFM can be used to detect and measure length polymorphisms between closely spaced markers (e.g., closer than about 400 bp). Alternatively, multiple Cas9 labels can be used as simple single-molecule barcodes, making the technology useful in counting applications such as detecting gene copy number variation, digital PCR, or transcriptional profiling. Achievable improvements in signal-to-noise and drift compensation will allow the OPU to reliably detect the DNA backbone as well as the Cas9 labels.
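The barcode-counting idea can be sketched in a few lines. The example below turns the label positions on each molecule into a barcode of binned inter-label spacings and counts how often each barcode occurs. The 100 bp bin width, the orientation handling, and the sample data are assumptions made for illustration, not parameters from the original work.

```python
from collections import Counter


def barcode(label_positions_bp, bin_bp=100):
    """Turn label positions on one molecule into an orientation-free barcode.

    Spacings between neighbouring labels are rounded to multiples of
    bin_bp (an assumed tolerance). Because a molecule can land on the
    surface in either orientation, the smaller of the forward and
    reversed spacing tuples is used as the canonical form.
    """
    positions = sorted(label_positions_bp)
    spacings = tuple(
        round((b - a) / bin_bp) for a, b in zip(positions, positions[1:])
    )
    return min(spacings, spacings[::-1])


def count_barcodes(molecules, bin_bp=100):
    """Count how many molecules carry each barcode."""
    return Counter(barcode(m, bin_bp) for m in molecules)


# Hypothetical label positions (bp) on four imaged molecules.
molecules = [
    [0, 510, 1490],   # spacings ~500 and ~1000
    [200, 690, 1710], # same pattern, shifted along the molecule
    [0, 980, 1490],   # same pattern, opposite orientation
    [0, 300],         # a different two-label pattern
]
for code, n in count_barcodes(molecules).items():
    print(code, n)
```

Counting molecules per barcode in this way is the kind of tally a copy-number or digital-PCR-style readout would build on, with the bin width set by the measured localization precision.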

source: Nature

Data throughput could be increased by imaging more molecules per frame, using a dedicated computational algorithm, and increasing the window size and pixel density.
The HS-AFM-OPU nanomapping approach is complementary to both sequencing and physical mapping of SVs and SNPs. It may be possible to combine HS-AFM nanomapping with hybrid capture or “inverse” long PCR, for instance, to isolate translocation breakpoint regions using only limited knowledge of the loci involved in the translocation. For a workable and more scalable future, the goal is to extend the DNA contour length beyond 13 kb while keeping data throughput high.
