Many serious illnesses are directly related to our genetics. Heart disease and cancer are known to “run in the family,” whereas Huntington’s disease and cystic fibrosis are caused directly by genetic mutations. Knowing how these diseases work allows researchers to find new ways of treating and preventing them. Without our collective understanding of genetics, many modern therapeutics would not exist. Much of the credit for that understanding can be traced back to the development of DNA sequencing in the 1970s, when Frederick Sanger and his colleagues created the method that bears his name: Sanger Sequencing.
We’ve discussed in previous blogs how deoxyribonucleic acid (DNA) serves as the blueprint for synthesizing functional molecules. Its makeup matters for this post, however, so a short explanation is in order. DNA is composed of building blocks called nucleotides. Each nucleotide contains a nitrogenous base that classifies it as adenine (A), guanine (G), cytosine (C), or thymine (T). Each base pairs with a partner on the opposite strand of the double helix: A pairs with T, and G pairs with C. Stretches of nucleotides are organized into genes, each of which encodes a particular molecule, such as a protein. In some genomes, particularly compact viral ones, genes can even overlap, so that more information fits into fewer nucleotides. In essence, the arrangement of genes allows DNA to pass on the instructions that an organism needs to live, develop, and reproduce.
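As a small illustration of the pairing rule, the sketch below works out the partner base for every position of a short strand. The function name and the example sequence are made up for this post and are not taken from any particular library.

```python
# Base-pairing rule: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Return the partner base for each position of the strand."""
    return "".join(PAIRS[base] for base in strand)

example = "ATGCGTTA"  # hypothetical sequence, chosen only for illustration
print(complement(example))
```

Applying `complement` twice returns the original strand, which is one way to see that either strand of the double helix carries enough information to rebuild the other.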
The Road to Sequencing DNA
After Watson and Crick famously announced the double-helix structure of DNA in 1953, scientists began to focus on developing a reliable method for DNA sequencing: the process of determining the order of nitrogenous bases. It was understood that DNA held instructions for producing molecules, but knowing the sequence was essential to gaining a deep understanding of living organisms.
Fred Sanger was a British biochemist who devoted his career to studying the structure of biomolecules. After he decoded the amino acid sequence of the insulin protein, it took Sanger around twenty years to develop a method for finding the order of bases in DNA. By 1977, two sequencing methods had been published: Sanger & Coulson’s and Maxam & Gilbert’s. The first relied on chain termination, while the second relied on chemical cleavage.
The Sanger & Coulson method uses DNA replication to create many shortened copies of an initial DNA strand, which can then be analyzed using gel electrophoresis. Many identical copies of the initial DNA are heated, causing the two strands to separate. Primers are then attached to the single strands, creating a starting point for making copies. The primed DNA is divided into four reactions, one for each nucleotide to decipher. To each reaction, polymerase is added along with many free nucleotides, so that a new strand is built along the template one complementary base at a time (A pairs with T, G pairs with C), with phosphodiester bonds joining each new nucleotide to the growing chain. A different chain-terminating nucleotide is also added to each of the four reactions. These nucleotides are incorporated as normal, but they lack the 3’ hydroxyl group, which stops any further extension once they are attached. Because termination can happen at any matching position, each reaction produces single-stranded DNA fragments of varying lengths that all end in the same known base. Gel electrophoresis then separates these fragments by length. Combining the fragment lengths with the knowledge of which chain-terminating nucleotide was added to each reaction lets you decipher the order of the nucleotides in the initial DNA strand!
The Sanger method can only read a limited length of DNA at a time, so in 1979 the shotgun sequencing technique was developed. In this method, the target DNA is broken into random fragments. These fragments are sequenced by Sanger’s technique and then assembled based on their overlapping regions.
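The logic of reading a gel, and of stitching shotgun fragments together, can be sketched in a few lines of Python. This is a toy model under simplifying assumptions: fragment lengths ignore the primer, every possible termination product is assumed to be present, and the assembler is a simple greedy merge and not how production assemblers work. All function names and sequences are hypothetical.

```python
def termination_lanes(copied_strand):
    """Simulate the four reactions: for each chain-terminating base,
    record the lengths of the fragments that end in that base."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(copied_strand, start=1):
        lanes[base].append(length)
    return lanes

def read_gel(lanes):
    """Read bands from the shortest fragment to the longest across all lanes."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)

def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, left in enumerate(reads):
            for j, right in enumerate(reads):
                if i != j:
                    n = overlap(left, right, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no remaining overlaps; leave the pieces unmerged
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

copied = "GATCCGTA"  # hypothetical strand built by the polymerase
lanes = termination_lanes(copied)
print(lanes)            # which fragment lengths show up in each lane
print(read_gel(lanes))  # sequence read off the gel, shortest band first

fragments = ["GATCCG", "CCGTAA", "TAACGT"]  # hypothetical shotgun reads
print(greedy_assemble(fragments))
```

Reading the bands from shortest to longest recovers the copied strand, and the original template follows from the pairing rule. The greedy merge is only meant to show why overlapping regions are enough to order the fragments; repeated sequences can mislead it.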
The Maxam & Gilbert method, while also successful, is more complex and requires more toxic chemicals than Sanger’s method. Commonly known as chemical sequencing, it uses chemicals to make base-specific partial cleavages in four different reactions. Because the Sanger method was easier to use, it became the method of choice, and it was later automated. Sequencing machines were developed to sequence thousands of bases per day, fluorescent tags replaced radioactive labelling, and capillary electrophoresis replaced polyacrylamide gels. In an automated sequencer, the DNA fragments pass under a laser that excites the tag, and a detector identifies the emitted color. This information is sent to a computer, producing a series of colored peaks. Software interprets each peak as a nucleotide base and generates a file with the DNA sequence.
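The last step, turning colored peaks into letters, can be illustrated with a deliberately simple sketch. It assumes the peaks have already been located and that each one comes with a signal strength for each of the four dyes; the data layout and the numbers are invented for illustration, and real base-calling software also handles noise, peak spacing, and quality scores.

```python
def call_bases(peaks):
    """Pick the brightest dye channel at each peak position."""
    return "".join(max(peak, key=peak.get) for peak in peaks)

# Hypothetical signal strengths for four consecutive peaks.
peaks = [
    {"A": 0.05, "C": 0.02, "G": 0.90, "T": 0.03},
    {"A": 0.85, "C": 0.04, "G": 0.06, "T": 0.05},
    {"A": 0.03, "C": 0.07, "G": 0.02, "T": 0.88},
    {"A": 0.04, "C": 0.91, "G": 0.03, "T": 0.02},
]
print(call_bases(peaks))
```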
Technological Advances in Sequencing
Building on the improvements already made to Sanger Sequencing, new methods have been developed to produce sequence data in far larger volumes and at lower cost. These methods, collectively known as Next Generation Sequencing (NGS), focus on sequencing by synthesis (SBS) and multiplexing, which means sequencing many different strands simultaneously. One widely used SBS approach involves the incorporation of a single fluorescently labeled nucleotide per cycle. The reaction is imaged after each cycle to determine which color was incorporated by each immobilized template. This technique was developed by Solexa, a company that was later acquired by Illumina. Other technologies, such as 454 sequencing (by Roche), the SOLiD platform (by Life Technologies), the Polonator, and the HeliScope single molecule sequencer, were also developed, resulting in a huge increase in sequence data, advances in genomics, and a decrease in the per-base cost of DNA sequencing. As we’ve seen with most technologies over the last few decades, sequencing today is cheaper, faster, and more accurate than ever.
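To make the cycle-by-cycle idea concrete, here is a minimal sketch of how one color per template per cycle turns into many reads at once. The color-to-base mapping and the image data are assumptions made up for this example and do not describe any specific instrument.

```python
# Hypothetical dye colors; real instruments define their own mapping.
COLOR_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}

def reads_from_cycles(cycle_images):
    """cycle_images[c][k] is the color seen at template k during cycle c.
    Returns one read per template, built up one base per cycle."""
    reads = []
    for colors_over_time in zip(*cycle_images):
        reads.append("".join(COLOR_TO_BASE[color] for color in colors_over_time))
    return reads

# Three cycles imaged across three immobilized templates (invented data).
cycle_images = [
    ["green", "red", "blue"],
    ["red", "red", "yellow"],
    ["yellow", "green", "green"],
]
print(reads_from_cycles(cycle_images))
```

Each image contributes one base to every template, which is why adding more templates increases the amount of sequence data without adding more cycles.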
Looking for more information about Macromoltek, Inc? Visit our website at www.Macromoltek.com