Mechanics of Synthesizing the Human Genome: How it will happen

Sebastian Wellford
Published in Cell Your Soul
5 min read · Jun 15, 2016
Image credit: Askpang

In May, a group of scientists met secretly at Harvard University to plan HGP-write, a project designed to create an entire functioning human genome out of chemicals. This is a discussion of the science and technology that will make this collaboration possible. For a discussion of the ethical implications of this project, see here.

The Human Genome Project was started in 1990 and completed in 2003. It was a systematic effort by thousands of scientists to accurately catalog all of the DNA that makes up a human being. It brought unprecedented insight to the scientific and medical communities, and its results now sit in databases available to researchers across the world.

It is important to note that artificial gene synthesis differs from cloning or PCR. Gene synthesis does not need to copy a pre-existing template. This opens up a wide range of possibilities for creating entirely new or specific strands of DNA not found in nature.

Synthesizing Bacterial Genomes

In 2008, J. Craig Venter led a team to synthesize the first artificial genome for an entire organism. Venter was instrumental in the sequencing of the human genome, as his shotgun sequencing method rapidly accelerated the pace at which we could map human chromosomes. His team was able to build the entire genome for Mycoplasma genitalium, a bacterium whose genome consists of 580,000 base pairs. Before this, the largest synthesized piece of DNA was 32,000 base pairs. This is because the building blocks of DNA — the nucleotides adenine (A), guanine (G), cytosine (C) and thymine (T) — are not easy chemicals to artificially synthesize into chromosomes. As the strands of DNA become longer, they become much more fragile.

Mycoplasma genitalium

To synthesize an entire genome, the team first created 101 “cassettes” of 5,000–7,000 base pairs. Each cassette starts from oligonucleotides (short strands of DNA), which are created by adding one nucleotide at a time to a chain. The DNA must be surrounded by protecting groups to prevent any other reactions from occurring during this process. These oligonucleotides can be up to 200 bases in length. Then, through a process called annealing, overlapping oligonucleotides pair up to form the longer cassettes of DNA. This step relies on the ligase and polymerase that living organisms use to build their DNA.
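
As a rough illustration of the overlap idea, here is a minimal Python sketch that chains short sequences into a longer one wherever the end of one matches the start of the next. It is a toy model, not the team's protocol: real annealing pairs complementary strands and uses enzymes to fill and seal gaps, while this sketch works on a single strand of text. The function names, the example sequences, and the `min_overlap` parameter are all made up for illustration.

```python
def merge_by_overlap(left, right, min_overlap=4):
    """Join two sequences if the end of `left` matches the start of `right`.

    Returns the merged sequence, or None if no overlap of at least
    `min_overlap` bases is found. The longest overlap is preferred.
    """
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    return None


def assemble(oligos, min_overlap=4):
    """Chain an ordered list of overlapping oligos into one sequence."""
    assembled = oligos[0]
    for oligo in oligos[1:]:
        merged = merge_by_overlap(assembled, oligo, min_overlap)
        if merged is None:
            raise ValueError(f"no overlap found before oligo {oligo!r}")
        assembled = merged
    return assembled


# Hypothetical 10-base oligos, each sharing 4 bases with its neighbor.
example_oligos = ["ATGGCGTACG", "TACGTTAGCC", "AGCCGGATAA"]
print(assemble(example_oligos))  # ATGGCGTACGTTAGCCGGATAA
```

The shared ends are what make the order unambiguous: each oligo can only attach where its sequence matches, which is why overlapping designs are used to build long molecules from short ones.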

Next, the cassettes were joined into 25 subassemblies, each about 24,000 base pairs in length. These fragments were cloned into the E. coli bacterium, which would produce more copies of the fragments for the next steps. The fragments were extracted from the E. coli, and three fragments were combined to give segments that were 1/8th the length of the total genome. This E. coli step was repeated to give four segments that could combine to make the whole genome.

At this point, the researchers found that E. coli could not clone pieces larger than the 1/4-genome segments. Rather than clone and combine, they used a method called homologous recombination in yeast (Saccharomyces cerevisiae). This enabled the strands of DNA to be combined within the yeast cell, forming the entire Mycoplasma genitalium genome.
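
The staged assembly can be sanity-checked with a little arithmetic. The sketch below divides the 580,000 base pair genome by the number of pieces at each stage to get an average piece size. The piece counts come from the description above, except the count of eight for the 1/8-genome stage, which is inferred from the fraction rather than stated. Overlaps between neighboring pieces are ignored, so real pieces would be slightly longer.

```python
GENOME_BP = 580_000  # M. genitalium genome size quoted in this article

# (stage name, number of pieces at that stage)
STAGES = [
    ("cassettes", 101),
    ("subassemblies", 25),
    ("1/8-genome segments", 8),  # assumption: inferred from "1/8th"
    ("1/4-genome segments", 4),
    ("complete genome", 1),
]


def average_piece_sizes(genome_bp, stages):
    """Return (name, count, average size in bp) for each assembly stage."""
    return [(name, count, genome_bp / count) for name, count in stages]


for name, count, size in average_piece_sizes(GENOME_BP, STAGES):
    print(f"{name:>20}: {count:>3} pieces of about {size:,.0f} bp")
```

The averages come out to roughly 5,700 base pairs per cassette and 23,200 per subassembly, which lines up with the 5,000–7,000 and 24,000 figures given for those stages.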

Once this genome was formed, the next step was putting it into a cell. Carole Lartigue and her colleagues pioneered genome transplantation in 2007, where a living cell is taken and its DNA is replaced by the DNA of the new genome. The chemically synthesized chromosomes can be used to activate the cell, essentially turning it into an entirely new organism based on the artificial genome. These cells are viable, meaning that they can reproduce indefinitely, with each new cell containing a copy of the synthesized genome.

Scaling Up

M. genitalium was used for the first genome synthesis because it was the organism with the lowest number of known genes at the time. Its genome is 580,000 base pairs, but the human genome is 3 billion base pairs spanning 23 chromosome pairs (roughly 5,000 times larger!). The human genome codes for over 19,000 proteins. New methods may have to be developed in order to manufacture strands of DNA this long. In addition, this DNA must be packaged into chromosomes, which was unnecessary for the bacterial synthetic genome. We must then transplant the genome into a new cell. This may be harder than it was for bacteria, since human cells have a nucleus and bacteria do not. Scientists will have to tackle these and other problems if they want to successfully create an entire human genome.
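
To put the jump in scale into numbers, the short Python sketch below divides the two genome sizes quoted above, which gives a ratio of about 5,200. It also estimates how many cassettes a human-sized genome would need if it were built from pieces like those used for M. genitalium. The 6,000 base pair cassette size is an assumption (the midpoint of the 5,000–7,000 range), and nothing here implies HGP-write would actually use cassettes of that size.

```python
import math

HUMAN_GENOME_BP = 3_000_000_000  # figure quoted in this article
M_GENITALIUM_BP = 580_000        # figure quoted in this article
CASSETTE_BP = 6_000              # assumption: midpoint of 5,000 to 7,000


def scale_factor(target_bp, reference_bp):
    """How many times larger the target genome is than the reference."""
    return target_bp / reference_bp


def cassettes_needed(genome_bp, cassette_bp):
    """Smallest number of non-overlapping cassettes that cover the genome."""
    return math.ceil(genome_bp / cassette_bp)


print(f"Scale factor: {scale_factor(HUMAN_GENOME_BP, M_GENITALIUM_BP):,.0f}x")
print(f"Cassettes for M. genitalium: {cassettes_needed(M_GENITALIUM_BP, CASSETTE_BP):,}")
print(f"Cassettes for a human genome: {cassettes_needed(HUMAN_GENOME_BP, CASSETTE_BP):,}")
```

Under these assumptions, the human genome would take about half a million cassettes, compared with about a hundred for the bacterium.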

Researchers are confident that the technology for this undertaking is ready. Many techniques they will use have been commonplace for about 40 years. With the recent advent of new technologies such as standardized gene parts, whole-genome synthesis, and CRISPR-Cas9 editing technology, synthesizing the human genome is the next step within our reach.

The Project

The proposed HGP-write project is still in the developmental phase. At Harvard University in May, a group of 130 scientists met behind closed doors to discuss the potential for such an undertaking. The leaders of the initiative are Dr. George Church, renowned professor of genetics at Harvard Medical School and director of PersonalGenomes.org; Dr. Jef Boeke, of NYU School of Medicine and Johns Hopkins Medicine; Andrew Hessel, futurist and researcher for the design software firm Autodesk; and Nancy J. Kelley, a lawyer and executive who raises funds for medical research.

The ultimate goal of this initiative is to be able to synthesize large genomes, including the human genome, and drive down the costs of DNA synthesis. The cost of DNA sequencing decreased 1,000-fold after the initial Human Genome Project. Scientists expect a similar decrease in the cost of synthesis after this project. The aim is not to create an entire human, but simply to establish a human cell line with a synthetic genome. The initiative will be broken down into numerous pilot projects, each focusing on about 1% of the human genome. Funding for the project is expected to launch in 2016 with about $100 million from public, private, philanthropic, industry, and academic sources around the world. The NIH has thus far declined to comment, saying that the project is too early in the developmental stages to consider funding. You can read the entire project outline in this Science article.

“What I cannot create, I do not understand.” — Words from the blackboard of Richard Feynman at the time of his death

For more posts about disease and biology, subscribe to Cell Your Soul. Feel free to comment below or message your feedback!


Atoms and cells studying themselves. Virginia Tech Biochemistry Class of 2018. @WellfordBiology on Twitter.