A Beginner’s Guide To Gene Editing
“You get what you get and you don’t throw a fit.”
Everyone has heard this saying at least once in their life. I remember growing up with my mom applying the quote to everyday life. Whatever color my teacher gave me from the crayon box was the color I got, no changing it.
This saying seems like it would apply to genetics. I was told that your DNA determined all of your traits, from your food preferences all the way to how you looked. I was born with long black hair, brown eyes, and tastebuds that love spicy foods; those were the cards dealt to me. DNA seemed to be a genetic lottery, with people getting stuck with whatever genes their parents gave them.
This applied to hereditary diseases as well. If a family member had cystic fibrosis or sickle cell, it meant there was a chance you had it too. There wasn’t anything that could be done about those situations. It was out of our control.
Is it really out of our control?
No, we can take the cards we were dealt and change them.
We can’t get into the idea of gene editing without first understanding genomics. A genome is the complete set of DNA (deoxyribonucleic acid) in an organism, and genomics is the study of genomes. To begin, we must first learn what DNA is. DNA is a long molecule built from units called nucleotides, and each nucleotide is made of a sugar, a phosphate group, and a nitrogenous base.
The nitrogenous bases are adenine, thymine, guanine, and cytosine (commonly known as A, T, G, and C). Bonds form between the two strands of a DNA molecule, with adenine binding to thymine and guanine binding to cytosine. Think of the bases as couples: A will only match with T, and G will only bond with C. These create what we call base pairs. The DNA in our cells is made of long chains of these base pairs (about 3 billion in one copy of the genome, or about 6 billion counting the copies from both parents).
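The pairing rule can be sketched in a few lines of Python. This is only an illustration: the names `PAIRS` and `complement` are made up for this example, and it assumes a strand is written as a string of the letters A, T, G, and C.

```python
# Pairing rules from the text: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner base for each base in the strand, in the same order."""
    return "".join(PAIRS[base] for base in strand.upper())


print(complement("ATGCCG"))  # prints TACGGC
```

Running `complement` twice gives back the starting strand, which is another way of seeing that each base has exactly one partner.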
RNA (ribonucleic acid) is similar to DNA. The two have similar structures, but RNA has one more oxygen atom in the sugar of its sugar-phosphate backbone, and it has the nitrogenous base uracil instead of thymine. Additionally, RNA is usually single-stranded rather than a double helix like DNA.
In school we are all taught about diversity and how every individual’s DNA is completely different. In actuality, humans are 99.9% genetically identical. The remaining 0.1% is the reason people have different appearances, traits, and health. It seems very small, but with about 3 billion base pairs in one copy of the genome, that 0.1% amounts to roughly 3 million differences. This is where variants, which are alterations to the most common DNA sequence, are found.
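The arithmetic behind that figure is quick to check. The sketch below assumes a round figure of 3 billion base pairs for one copy of the human genome; the variable names are made up for this example.

```python
# Assumed round figure: about 3 billion base pairs in one copy of the human genome.
GENOME_LENGTH = 3_000_000_000

# 0.1% is 1 part in 1,000; integer math avoids floating-point rounding.
differing_bases = GENOME_LENGTH // 1000

print(f"{differing_bases:,}")  # prints 3,000,000
```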
This is key information because genes are made up of DNA. Genes contain the instructions for producing certain proteins. Genes vary in size from a few hundred base pairs to millions. Chromosomes are long DNA molecules that each carry many genes. Humans have 23 pairs of chromosomes, with one chromosome of each pair coming from each parent, for 46 in total.
The most common way genes create proteins is that DNA is used to make RNA, which is then used to make proteins. This happens through transcription and translation.
There are three steps to transcription:
1. Initiation: an enzyme called RNA polymerase binds to the DNA and unwinds it so that it can read the bases in the DNA strand.
2. Elongation: RNA polymerase builds the RNA by adding the base that is complementary to each DNA base (for example, if the DNA strand has a cytosine, the RNA is built with a guanine; note that in RNA, uracil takes the place of thymine and binds with adenine).
3. Termination: RNA polymerase reads a stop sequence. This means the RNA strand is complete, and it detaches from the DNA.
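The elongation and termination steps can be sketched as a small Python function. This is a simplified illustration, not how a cell or any bioinformatics library does it: it ignores the direction in which strands are read, and `stop_sequence` is a hypothetical stand-in for a real termination signal.

```python
# Base added to the RNA for each base read from the DNA strand.
# RNA uses uracil (U) instead of thymine (T), so a DNA adenine is paired with U.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(dna_strand: str, stop_sequence: str = "") -> str:
    """Build the complementary RNA, stopping early if stop_sequence is found."""
    dna_strand = dna_strand.upper()
    if stop_sequence:
        stop_at = dna_strand.find(stop_sequence)
        if stop_at != -1:
            # Termination: nothing from the stop sequence onward is copied.
            dna_strand = dna_strand[:stop_at]
    # Elongation: add the complementary RNA base for each DNA base.
    return "".join(DNA_TO_RNA[base] for base in dna_strand)


print(transcribe("TACGGCAAA"))  # prints AUGCCGUUU
```

For example, `transcribe("TACGGCTTTT", stop_sequence="TTTT")` returns `AUGCCG`, because copying ends where the made-up stop sequence begins.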
Translation happens in the ribosome, where the RNA is read in groups of three bases (called codons) until it reaches a codon that signals a stopping point. Each codon specifies an amino acid, and the amino acids are joined together into a polypeptide chain. When the ribosome reaches the stop signal, the polypeptide chain is finished, creating a protein.
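Reading RNA three bases at a time can be sketched like this. The table below is a small subset of the standard genetic code, chosen just for the demo, so any codon outside it will raise a `KeyError`; the function name `translate` is made up for this example.

```python
# A small subset of the standard genetic code (the full table has 64 codons).
CODON_TABLE = {
    "AUG": "Met", "CCG": "Pro", "UUU": "Phe",
    "GGC": "Gly", "GCC": "Ala", "AAA": "Lys",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def translate(rna: str) -> list[str]:
    """Read the RNA in groups of three and collect amino acids until a stop codon."""
    chain = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError for codons missing from this demo table
        if amino_acid == "STOP":
            break  # the polypeptide chain is finished
        chain.append(amino_acid)
    return chain


print(translate("AUGCCGUUUUAAGGC"))  # prints ['Met', 'Pro', 'Phe']
```

Notice that the final `GGC` is never read, because the stop codon `UAA` comes before it.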
Those are the basics of DNA, RNA, and genes. Another core concept in genome experiments is sequencing. Sequencing finds out what genetic information is contained in a strand of DNA. One of the more commonly used methods is Next Generation Sequencing (NGS). This method is popular because of how fast it is. There are 3 steps:
1. A DNA sample is taken and put into a sequencing instrument. High-frequency sound waves in the instrument break the DNA into smaller pieces that are about 600 base pairs long. To finish prepping the segments, special tags are added to the ends of the DNA. These tags are able to stick to a glass slide, and they allow the different segments to be identified so that they can eventually be reassembled into the original strand of DNA.
Note: Before sequencing begins there is a mini step where each DNA segment is replicated, because a single small segment can’t be read on its own. This is done through a polymerase chain reaction (PCR), which creates copies of the DNA.
2. A sequencer begins to read the DNA one base at a time, using colored tags for the bases (this is done to all copies of the DNA fragment). The color pattern that is shown is put into a computer, which can then write out what the sequence is.
3. The data is put through a computer that is able to piece the tagged segments together to recreate the original DNA strand. People can then look over the assembled sequence for any variants. If there are variants or discrepancies, scientists try to figure out which genes they are in and what they mean.
DNA sequencing gives us useful information because it highlights changes in a gene that may cause disease or other effects.
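Once a sequence has been assembled, the simplest way to spot changes is to compare it with a reference sequence position by position. The sketch below is a minimal illustration with made-up sequences. It assumes the two sequences are already the same length, so it only finds single-base changes; real tools align the sequences first, which is beyond this example.

```python
def find_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) wherever the two sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


print(find_substitutions("ATGCCGTTA", "ATGCAGTTA"))  # prints [(4, 'C', 'A')]
```

Positions are counted from 0, so the result says that the fifth base changed from C to A.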
There are many different types of variants that can be found, and a lot of them are self-explanatory. Types of variants include the following:
- Substitution: this variant replaces one DNA nucleotide with another nucleotide. Substitutions can be classified a step further based on the effect they have on the production of protein. A missense variant causes one amino acid to be replaced by another, which may alter the function of the protein being made. A nonsense variant causes a stop signal to be read before the protein is complete. This leads to a shortened protein that may break down, function improperly, or be nonfunctional.
- Insertion: this variant adds one or more nucleotides to a gene, so the protein made may not function properly.
- Deletion: this variant is when one or more nucleotides are missing. There are also large deletion variants where an entire gene or several genes are missing. Either way, the function of the protein made from the DNA can be altered.
- Inversion: this variant is unique because the sequence still contains all of the same nucleotides, but multiple nucleotides are changed. A stretch of the original sequence is replaced with the same stretch in reverse order.
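The four variant types above can each be written as a one-line string operation. This is a sketch using made-up function names and a made-up starting sequence. The inversion follows the description in this article (the same letters in reverse order) and does not model the two DNA strands.

```python
def substitute(seq: str, position: int, new_base: str) -> str:
    """Replace the base at `position` with `new_base`."""
    return seq[:position] + new_base + seq[position + 1:]


def insert(seq: str, position: int, new_bases: str) -> str:
    """Add `new_bases` before the base at `position`."""
    return seq[:position] + new_bases + seq[position:]


def delete(seq: str, position: int, length: int) -> str:
    """Remove `length` bases starting at `position`."""
    return seq[:position] + seq[position + length:]


def invert(seq: str, start: int, end: int) -> str:
    """Reverse the order of the bases from `start` up to (not including) `end`."""
    return seq[:start] + seq[start:end][::-1] + seq[end:]


original = "ATGCCGTTA"
print(substitute(original, 4, "A"))  # prints ATGCAGTTA
print(insert(original, 3, "GG"))     # prints ATGGGCCGTTA
print(delete(original, 3, 3))        # prints ATGTTA
print(invert(original, 3, 6))        # prints ATGGCCTTA
```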
After all of that background knowledge we can finally begin to apply the information by learning about gene editing! Gene editing is the manipulation of genetic material in organisms.
There are different techniques for gene editing. Some well-known ones are:
- Zinc Finger Nucleases (ZFNs)
- Transcription Activator-Like Effector Nucleases (TALENs)
- Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)
Each of these methods has different benefits and does different things to DNA.
Zinc Finger Nucleases (ZFNs)
Zinc finger nucleases were the first programmable nucleases. A nuclease is an enzyme that breaks large chains of nucleotides into smaller units. ZFNs are made from an endonuclease and zinc fingers, and they can be thought of as a type of artificial restriction endonuclease. A restriction endonuclease is an enzyme that recognizes a specific sequence and cuts the DNA at that point. A zinc finger is a protein that identifies and binds to a certain part of DNA (a codon, or 3 nucleotides). A ZFN will recognize a specific sequence of bases in DNA and cut there. ZFNs are synthetic, but they are made with biological molecules.
The concept is to link together several zinc fingers, each of which recognizes a specific codon (for example, only ATA). This creates a larger structure that will only bind if it identifies a specific longer sequence (think of three ovals in a row, since three is the most common number of zinc fingers in a ZFN). This is repeated on the other strand of the DNA (DNA is double-stranded) with another set of zinc fingers. This leaves a section in between that can be cut by a nuclease. To make this happen, each set of zinc fingers is attached to half of a nuclease. This nuclease is called FokI.
In general, ZFNs need 18 bases to be correct, 9 “on top” of the DNA and 9 “on the bottom” of the DNA, with about 6 bases in between. FokI needs this gap of about 6 base pairs in order to cleave. It cuts in a staggered manner, creating sticky ends with 4-nucleotide overhangs.
By removing DNA at the cut site, the sequence becomes mutated; specifically, this is a deletion mutation.
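The "9 on top, 6 in between, 9 on the bottom" layout can be sketched as a search over a DNA string. Everything here is an assumption made for illustration: the function names, the made-up sequences, and the choice to write the bottom site as it would be read on the bottom strand (so it shows up on the top strand reversed and with each base swapped for its partner).

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read in the opposite direction."""
    return "".join(PAIRS[base] for base in reversed(seq))


def find_zfn_sites(dna: str, top_site: str, bottom_site: str, spacer: int = 6) -> list[int]:
    """Return start positions where both sites flank a gap of `spacer` bases.

    `dna` is the top strand. `bottom_site` is written as read on the bottom
    strand, so on the top strand it appears as its reverse complement.
    """
    bottom_on_top = reverse_complement(bottom_site)
    span = len(top_site) + spacer + len(bottom_site)
    hits = []
    for start in range(len(dna) - span + 1):
        left = dna[start:start + len(top_site)]
        right = dna[start + len(top_site) + spacer:start + span]
        if left == top_site and right == bottom_on_top:
            hits.append(start)
    return hits


# Made-up sequences for illustration only.
top_site = "GATGCTGAC"
bottom_site = "ACCGGTCCA"
dna = "CC" + top_site + "TTTAAA" + reverse_complement(bottom_site) + "GG"
print(find_zfn_sites(dna, top_site, bottom_site))  # prints [2]
```

If the gap between the two sites is the wrong size, nothing is found, which mirrors the idea that the two halves of the nuclease only meet when both sets of zinc fingers bind the right distance apart.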
ZFNs seem to be very specific, but there are some toxicities associated with them. Toxicity here means harm to the genetic information that can cause a cell to malfunction or die. This happens because of how the 18-base-pair sequence is read. Theoretically, the sequence should be so specific that only one site matches it, but in practice multiple areas seem to be cleaved. ZFNs can recognize variants of the site they were designed for, which means they cleave at multiple sites. These unintended sites are called “off-targets”.
ZFNs with a greater affinity (bond or attraction) for their binding sites are less toxic than those with a smaller affinity.
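One way to picture off-targets is to look for sites that almost match the intended sequence. The sketch below counts mismatches between the target and every same-length window of the DNA. The names, the sequences, and the mismatch limit are all made up for illustration, and real off-target prediction takes much more into account.

```python
def count_mismatches(a: str, b: str) -> int:
    """Count the positions where two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_near_matches(dna: str, target: str, max_mismatches: int) -> list[tuple[int, str, int]]:
    """Return (position, site, mismatches) for every site close enough to the target."""
    hits = []
    for start in range(len(dna) - len(target) + 1):
        site = dna[start:start + len(target)]
        mismatches = count_mismatches(site, target)
        if mismatches <= max_mismatches:
            hits.append((start, site, mismatches))
    return hits


target = "GATGCTGAC"
# The second site differs from the target at one base (T changed to A).
dna = "AA" + target + "TTTT" + "GATGCAGAC" + "CC"

print(find_near_matches(dna, target, max_mismatches=0))  # prints [(2, 'GATGCTGAC', 0)]
print(find_near_matches(dna, target, max_mismatches=1))
# prints [(2, 'GATGCTGAC', 0), (15, 'GATGCAGAC', 1)]
```

With no mismatches allowed there is one site, but tolerating a single mismatch already adds a second one, which is the off-target.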
Another issue with ZFNs stems from their lack of flexibility. It is hard to produce ZFNs that recognize every single codon.
ZFNs aren’t as commonly used today because better techniques have been developed.
Transcription Activator-Like Effector Nucleases (TALENs)
Transcription Activator-Like Effector Nucleases (TALENs) are artificial nucleases that cut DNA at very specific sites. TALENs are made out of nucleases and transcription activator-like effector proteins (TALEs), which are proteins secreted by Xanthomonas bacteria. Each repeat in a TALE binds to a single nucleotide (one A, T, G, or C). This is more specific than zinc fingers, because each zinc finger binds to three bases.
TALEs contain a series of similar repeats that are used for DNA recognition. Each repeat is a stretch of 33–35 amino acids, and together the repeats form a repeating array. The pair of amino acids in the 12th and 13th positions of each repeat determines its nucleotide specificity (that is, which nucleotide the repeat will bind to).
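Because each repeat handles one nucleotide, designing a TALE is like spelling out the target one letter at a time. The sketch below shows the idea. The specific pairings in the table are not from this article: they are the commonly cited amino acid pairs (written in one-letter codes) and are simplified here, since for example NN is often described as binding A as well as G. The function names are made up.

```python
# Commonly cited amino acid pairs at positions 12 and 13 of a repeat,
# and the DNA base each one is usually described as binding (simplified).
REPEAT_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}
BASE_FOR_REPEAT = {pair: base for base, pair in REPEAT_FOR_BASE.items()}


def design_tale(target: str) -> list[str]:
    """Pick one repeat per base of the target sequence."""
    return [REPEAT_FOR_BASE[base] for base in target.upper()]


def bound_sequence(repeats: list[str]) -> str:
    """Work out which DNA sequence a chain of repeats would bind."""
    return "".join(BASE_FOR_REPEAT[pair] for pair in repeats)


repeats = design_tale("GATC")
print(repeats)                  # prints ['NN', 'NI', 'NG', 'HD']
print(bound_sequence(repeats))  # prints GATC
```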
If you attach many TALE repeats together, they form one big protein (typically 9 are attached together). This can then bind to a sequence of nucleotides. Then half of an endonuclease (FokI) is attached to the TALE chain. It is inactive by itself, so the other half of FokI must be present for it to become an active enzyme.
Another TALE chain is created to match another set of bases, and it is attached to the other half of FokI. When the two halves of FokI meet, they cut within the 6 base pairs between the two TALE chains. The cut is staggered, similar to a ZFN’s, which creates sticky ends.
The TALE proteins and the endonuclease combined are what is called a TALEN. It is very specific because it requires a specific combination of (normally) 18 nucleotides.
TALENs are more specific than ZFNs, so they are easier to work with when targeting a specific part of DNA. Because of this, off-targets are not as big a concern with TALENs. However, they are limited to a single mutation, meaning they target only one site at a time, so they can only help with genetic diseases that affect a single gene. They are also very large because of the 18 total repeats, so it can be more difficult to deliver TALENs into cells.
Overall, TALENs are very similar to ZFNs, but they shine in places where ZFNs couldn’t.
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)
CRISPR is the gene editing method that most people have heard of. It was one of the most revolutionary techniques created. It shrank the cost of gene editing by 99% and takes less time. If the other methods were like a map, CRISPR is like a GPS.
CRISPR is an adaptive immune system found inside bacteria. It fights off viruses by detecting viral DNA and destroying it. Cas9 is a protein within the system that is able to find and cut viral DNA. It is an endonuclease that can recognize DNA that matches a 20-nucleotide sequence in a guide RNA.
When viruses infect a cell, they inject their DNA into the cell. With CRISPR, a piece of DNA is taken from the virus and inserted into the DNA of the bacterium. These little bits of DNA are inserted at a site called CRISPR, where they are integrated into an array. In the array, this creates a pattern of repeats in between the bits of viral DNA. CRISPR thus records which viruses the cell has been exposed to, and this record gets passed on to many generations of cells.
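The "repeat, viral snippet, repeat, viral snippet" pattern can be sketched as a simple string layout. The repeat sequence and the viral snippets below are made up for illustration, and the sketch assumes no snippet happens to contain the repeat itself.

```python
REPEAT = "GTTTTAGAGC"  # made-up repeat sequence, for illustration only


def build_array(snippets: list[str], repeat: str = REPEAT) -> str:
    """Lay out the array as repeat, snippet, repeat, snippet, ..., repeat."""
    return repeat + "".join(snippet + repeat for snippet in snippets)


def read_snippets(array: str, repeat: str = REPEAT) -> list[str]:
    """Recover the stored viral snippets by splitting the array at each repeat."""
    return [piece for piece in array.split(repeat) if piece]


memory = []                # viral snippets recorded so far
memory.append("ACGTACGT")  # snippet kept from a first (made-up) virus
memory.append("TTGGCCAA")  # snippet kept from a second (made-up) virus

array = build_array(memory)
print(read_snippets(array))  # prints ['ACGTACGT', 'TTGGCCAA']
```

The repeats act like dividers in a filing cabinet, which is what lets the stored snippets be told apart later.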
Once the tiny bits of viral DNA are in the chromosome of a cell, the cell produces RNA copies of them. This RNA gets broken into smaller pieces, each containing one sequence of viral DNA and one sequence of the repeat. The bits of RNA then bind to another RNA molecule called tracrRNA (trans-activating CRISPR RNA). This structure then binds to the Cas9 protein. The protein then searches through the DNA in the cell for any sites that match the RNA. If a match is found, Cas9 cuts the viral DNA, making a double-stranded break. This then leads to the degradation of the viral DNA.
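The search-and-cut behavior can be sketched as a string search followed by a split. This is a simplified illustration with a made-up guide and made-up DNA. It only searches one strand, it ignores the short neighboring motif (the PAM) that real Cas9 also requires, and the default `cut_offset` of 17 is an assumption based on the commonly described cut three bases from the end of the 20-base match.

```python
def find_targets(dna: str, guide_rna: str) -> list[int]:
    """Return every position where the DNA matches the guide sequence."""
    target = guide_rna.upper().replace("U", "T")  # write the RNA guide in DNA letters
    hits = []
    start = dna.find(target)
    while start != -1:
        hits.append(start)
        start = dna.find(target, start + 1)
    return hits


def cut(dna: str, match_start: int, cut_offset: int = 17) -> tuple[str, str]:
    """Split the DNA inside the matched site, modelling a double-stranded break."""
    position = match_start + cut_offset
    return dna[:position], dna[position:]


guide_rna = "GACUUCGAUCGGAUCCAUGC"  # made-up 20-nucleotide guide
dna = "TTAA" + guide_rna.replace("U", "T") + "AGGCC"

hits = find_targets(dna, guide_rna)
print(hits)  # prints [4]

left, right = cut(dna, hits[0])
print(left + right == dna)  # prints True: the pieces are the original DNA, now in two parts
```

Changing `guide_rna` is all it takes to send the search somewhere else, which is the sense in which the system is programmable.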
The sentinel complex of Cas9 and the RNA is programmable. Cells can detect when their DNA is broken and can fix it in two ways: by pasting the ends of the DNA back together, or by placing a new piece of DNA at the site. By causing a break at a chosen site, scientists can trigger the cell into fixing a mutation there.
There are two main ways DNA strands can be repaired after a double-stranded break: Nonhomologous End Joining (NHEJ) and Homologous Recombination (HR).
NHEJ is the primary pathway for the repair of double-stranded breaks. It relies on a protein called Ku, which threads onto the broken ends of the DNA. The goal of Ku is to get DNA ligase to join the two fragments of DNA together into one strand. If Ku recognizes that the DNA needs to be trimmed, it will recruit nucleases, and if Ku recognizes that the DNA needs to be filled in, it will recruit polymerases (which can add nucleotides). This is a quick process that usually takes about 10 minutes.
HR is another commonly used pathway. It only works if there is an exact copy of the broken chromosome available. RPA proteins attach to the terminal ends of the DNA. This then signals proteins like BRCA2, RAD51, and RAD52 to come to the site. These proteins associate themselves with the terminal ends of the DNA.
RAD51 and RAD52 guide the broken DNA to the copy (only a portion of the entire sequence). The broken ends of DNA then start to get new nucleotides that match the copy. After that, endonucleases come in and cut the copied DNA out of its sequence. This creates three cut sequences of DNA. Two of them repair the broken DNA, and one becomes part of the sequence the copy was cut from. Finally, ligase comes in and joins all the DNA.
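The difference between the two repair routes can be sketched with strings. This is a heavily simplified picture with made-up names and sequences: NHEJ is modelled as gluing the two ends together, and HR is modelled as looking up the missing bases in an intact copy. None of the proteins described above appear in the code.

```python
def repair_nhej(left_end: str, right_end: str) -> str:
    """Paste the two broken ends straight back together."""
    return left_end + right_end


def repair_hr(left_end: str, right_end: str, template: str, overlap: int = 4) -> str:
    """Fill the gap between the broken ends using an intact copy as a guide."""
    left_anchor = template.find(left_end[-overlap:])
    if left_anchor == -1:
        raise ValueError("the copy does not match the left broken end")
    gap_start = left_anchor + overlap
    gap_end = template.find(right_end[:overlap], gap_start)
    if gap_end == -1:
        raise ValueError("the copy does not match the right broken end")
    # Copy the bases that sit between the two matching stretches of the template.
    return left_end + template[gap_start:gap_end] + right_end


original = "ATGCCGTTAGGCATCGA"
# Model a break in which three bases (TAG) were lost between the two ends.
left_end, right_end = original[:7], original[10:]

print(repair_nhej(left_end, right_end))          # prints ATGCCGTGCATCGA
print(repair_hr(left_end, right_end, original))  # prints ATGCCGTTAGGCATCGA
```

In this sketch the NHEJ result is three bases shorter than the original (a small deletion), while the HR result matches the copy exactly.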
To harness the Cas9 protein, the CRISPR systems had to be analyzed further. There are two classes of CRISPR systems: class one and class two. Class one systems include multiple genes and Cas proteins that have to be present in the cell and have to assemble with CRISPR RNAs to form surveillance complexes. Class two systems include a single gene that encodes one large protein that binds with CRISPR RNA.
Cas9 changes its structure when it is bound to the RNA. A rotation happens that opens a channel in the center where the guide RNA lies. Then, when DNA is introduced, a domain in the protein called HNH rotates toward the center. HNH is one of the endonuclease domains and cuts one DNA strand, and RuvC is the other endonuclease domain and cuts the other DNA strand.
CRISPR is the cheapest, fastest, most accurate, and most efficient genome editing technique to date, but it has some limitations. Just like with the other methods, there is a slight chance of off-target effects, and the consequences for future generations are unknown. The biggest concern people have with CRISPR is the double-strand break. Double-strand breaks can be dangerous because they compromise the integrity of the DNA, which can lead to serious consequences.
Although gene editing solves many problems, there is a lot of controversy surrounding the experiments and their results. It can be easy to just shrug these concerns off. However, some of these points raise the question of when the value of learning about potential benefits is outweighed by the risks. Most debates about gene editing fall into 3 categories: safety, equality, and embryo testing.
On safety, people are very concerned about the side effects of gene editing. Things like off-targets can cause DNA to be edited in the wrong place or inconsistently among cells. One mistake can lead to consequences that affect future generations.
There is a certain amount of unpredictability with gene editing. This causes fear in the minds of the public.
There is also a large concern about how gene editing will actually be used. The concept of editing babies has been around ever since the idea of gene editing was introduced. People worry about living in a world where babies can be genetically modified to have “better traits”. The thought of this being in the future is unsettling to many people.
On equality, with any new medical discovery, the discussion over who will have access to it is always present. Although CRISPR has lowered the cost of gene editing, it is still very expensive. This limits the treatment to those in the upper class. Gene editing could widen the divide between the poor who can’t get medical treatment and the rich who can afford it.
Using embryos in testing is always controversial. People can have moral and/or religious reasons for not wanting testing to be done on embryos. There can be unintended side effects to editing genes, and these can harm an embryo. There is also an issue of consent with embryos and children. It is hard for parents to understand how severe the consequences of gene editing can be, so some people believe that when parents consent, it isn’t truly informed consent.
Opponents of gene editing fear that without proper guidelines, embryos and children could be taken advantage of and tested on without thorough research. This fear started to become reality when a Chinese scientist, He Jiankui, used CRISPR to edit the genes of twin girls. The potential dangers and benefits are still heavily debated.
Those are the basics of gene editing! Gene editing is revolutionary to the medical community. It opens up the possibility to help cure genetic disease, cancer, and viral infections. I believe it is the next step in science that will change the way we look at diseases. No longer will people have to be told that nothing can be done about their genetic disease. The human body will be capable of building a strong immune system that can spot cancer sooner or fight viruses more efficiently. The different possibilities are endless!
I can’t wait to see where gene editing will take us!