Fundamentals
The Future of Digital Data Storage Lies in DNA
How can something that has existed much longer than mankind itself solve one of our most pressing and significant problems today?
By the end of 2025, the entire digital universe is predicted to be made up of 175 zettabytes of data (1 zettabyte is 10^21 bytes, or 1,000,000,000,000,000 megabytes). Global demand for data has surpassed the world’s capacity to store it all. What’s the solution to this growth in data volume? Deoxyribonucleic acid (DNA) molecules!
The lack of adequate storage technology could lead to the loss of data, but DNA could counter this issue with durability, efficiency, and astounding density. By leveraging digital DNA storage, we won’t have to carefully pick which mementos to keep for 100 or 1,000 years; we could keep everything.
HOW IS DIGITAL DATA CURRENTLY STORED?
Before learning about storing data in DNA, it’s important to understand how it’s stored right now, in our computers, tablets, and cellphones.
Within your electronic device, data is kept on storage media such as hard disk drives (HDDs), solid-state drives (SSDs), USB flash drives, and SD cards. Photos, documents, audio recordings, and videos are represented as binary digits (also known as bits), each of which has a value of 1 (true) or 0 (false). Letters are converted into 1s and 0s; photos are converted into sets of numbers that describe the location, colour, and brightness of each pixel.
Using the binary numeral system, just two digits can represent every alphanumeric character. For example, the initialism “DNA”, when each letter is converted to its 8-bit ASCII code, becomes 01000100 01001110 01000001.
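The text-to-binary conversion above can be reproduced in a few lines of Python. This is only a minimal sketch: it assumes plain ASCII text, and the function name text_to_bits is made up for illustration.

```python
def text_to_bits(text: str) -> str:
    """Return the ASCII code of each character as a space-separated 8-bit group."""
    # format(byte, "08b") renders a number as binary, padded to 8 digits.
    return " ".join(format(byte, "08b") for byte in text.encode("ascii"))


print(text_to_bits("DNA"))  # 01000100 01001110 01000001
```

Each letter takes exactly one byte (8 bits) here, which is why three letters produce three groups.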
Computer systems capable of storing, processing, and disseminating data are housed in buildings known as data centres. These facilities are used by government agencies, educational and financial institutions, retailers, and tech giants like Google and Facebook for IT operations.
They are often resource-, energy- and cost-intensive, yet a better way to store data has existed within us all along…
WHAT IS DNA STORAGE?
Not only is DNA the oldest material capable of storing data, but it also has the largest storage capacity. Our entire genome is encoded in our DNA, making it a natural carrier of all genetic information. But why stop there? Couldn’t DNA hold other kinds of information?
DNA digital data storage is the process by which binary data is encoded in and decoded from DNA.
Although great progress has been made in digital data storage so far, biology is ready to take over.
HOW IS DIGITAL DATA STORED IN DNA?
Remember how text and photos must be converted to binary values in order to be stored on computers? Well, this binary data can then be turned into the building blocks of DNA (also known as nucleotides or bases)! By mapping each pair of bits to one of the four bases A, C, G, or T, data can be encoded in a sequence. For example, 00 becomes A, 10 becomes C, 01 becomes G, and 11 becomes T. Since “DNA” is 01000100 01001110 01000001 in binary, it is converted to GAGAGATCGAAG.
The DNA sequence, which at this point is just a text file, is then synthesized into the molecule it represents. When the data needs to be retrieved, scientists read the DNA with next-generation sequencing tools and decode it into the original file.
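To make the mapping concrete, here is a minimal Python sketch of the two-bits-per-base scheme described above, in both directions. The function names encode and decode are made up for illustration, and the sketch assumes ASCII text. Practical DNA storage schemes generally add redundancy and other safeguards on top of a simple mapping like this one.

```python
# Mapping taken from the article: 00 -> A, 10 -> C, 01 -> G, 11 -> T.
BITS_TO_BASE = {"00": "A", "10": "C", "01": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(text: str) -> str:
    """Turn ASCII text into a string of bases, two bits per base."""
    bits = "".join(format(byte, "08b") for byte in text.encode("ascii"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> str:
    """Turn a string of bases back into the original ASCII text."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


print(encode("DNA"))           # GAGAGATCGAAG
print(decode("GAGAGATCGAAG"))  # DNA
```

Since every base carries two bits, each 8-bit character becomes four bases, so the three letters of “DNA” become twelve bases.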
WHAT ARE THE CHALLENGES OF DNA STORAGE?
At the moment, the barriers to viable DNA digital data storage are cost and efficiency. Synthesizing (writing) and sequencing (reading) DNA is still too expensive for the technology to be profitable. Additionally, there’s a need for more efficient methods to encode data in DNA and decode it again.
Proponents often point to the first commercial hard drive, built by IBM, to illustrate the prospects of DNA digital data storage. In 1956, this first hard drive weighed about a ton and held 5 MB of data, at a cost of $10,000 per megabyte. Yet this heavy hard drive paved the way for modern storage media. In the same way, today’s research and experiments in DNA digital data storage are paving the way for less expensive methods and more effective read-write techniques.
WHO ARE THE PIONEERS IN DNA STORAGE?
In 1959, physicist Richard P. Feynman first suggested a link between biology and manufacturing in his lecture “There’s Plenty of Room at the Bottom: An Invitation to Enter a New Field of Physics”.
“Consider the possibility that we too can make a thing very small which does what we want.”
In 1988, in a very early application of DNA storage, an image was stored in Escherichia coli. Since then, movies, Wikipedia, books, and the Universal Declaration of Human Rights have all been stored in and retrieved from DNA.
In 2016, a team of Microsoft and University of Washington researchers packed about 200 megabytes of data into a fraction of a drop of liquid. They were able to encode and decode books, works of art, and a song, as well as retrieve the correct sequences from a large pool of DNA and reconstruct the data without losing a byte of information.
After the team had encoded the data into DNA sequences, Twist Bioscience, a synthetic biology company, turned those sequences into the molecules themselves. Using polymerase chain reaction (PCR), the team amplified the strands they wanted to recover. Once the concentration of a specific segment had increased, they took a sample, decoded the DNA, and ran error correction computations.
“We’re essentially repurposing it to store digital data (pictures, videos, documents) in a manageable way for hundreds or thousands of years,” says Luis Ceze, a principal researcher in the study and a professor of computer science and engineering at the University of Washington.
The team also took a multidisciplinary approach in their research by applying computer science principles, such as error correction schemes to encode the information and random access memory (RAM) fundamentals to read the data.
“This is an example where we’re borrowing something from nature (DNA) to store information. But we’re using something we know from computers (how to correct memory errors) and applying that back to nature.”
Read their paper here.
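The two computer science ideas mentioned above, retrieving one file from a large pool and correcting read errors, can be illustrated with a toy Python sketch. It is not the team’s actual method: the primers “ACGT” and “TTGC”, the example pool, and the function names are all hypothetical, and the sketch assumes that errors are simple substitutions in reads of equal length. Selecting reads by their leading tag stands in for PCR amplification, and a per-position majority vote stands in for a real error correction scheme.

```python
from collections import Counter


def select_by_primer(pool: list[str], primer: str) -> list[str]:
    """Keep only the reads that start with `primer`, with the primer stripped off."""
    return [read[len(primer):] for read in pool if read.startswith(primer)]


def consensus(reads: list[str]) -> str:
    """Pick the most common base at each position across equal-length reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


# Hypothetical pool: two "files", each tagged with a made-up 4-base primer.
pool = [
    "ACGT" + "GAGAGATCGAAG",
    "ACGT" + "GAGAGTTCGAAG",  # one substitution error
    "ACGT" + "GAGAGATCGAAG",
    "TTGC" + "CCCCAAAATTTT",  # belongs to a different file
    "ACGT" + "GAGACATCGAAG",  # one substitution error
]

reads = select_by_primer(pool, "ACGT")
print(consensus(reads))  # GAGAGATCGAAG
```

Because each error appears in only one of the four selected reads, the majority vote recovers the original sequence at every position.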
WHAT ARE THE ADVANTAGES OF DNA STORAGE?
A key advantage of storing digital data in DNA is its remarkable density. As mentioned, DNA holds our entire genetic information in an incredibly small volume. Since one gram of DNA can store almost 1 zettabyte, 44 grams could hold about 44 zettabytes, or 44 billion terabytes! DNA is thus far denser than any known form of computer storage. By comparison, data centres generally consume a lot of energy, and their storage media can hold digital data for only a relatively short time.
To put things into perspective, most hard drives last between 3 and 5 years. DNA is very durable and could therefore store digital data for centuries! Just look at Ötzi, the Iceman. He lived between 3400 and 3100 BCE, and scientists have been able to decode his genetic information to learn what he ate, how he lived, and which diseases he suffered from, and even to discover that he still has living relatives. After thousands of years, his DNA had preserved enough data to reveal more about the past.
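The density arithmetic can be checked with a short Python sketch. It assumes decimal units (1 zettabyte = 10^21 bytes, 1 terabyte = 10^12 bytes) and rounds the article’s “almost 1 zettabyte per gram” up to exactly 1, so the results are upper bounds. The constant and function names are made up for illustration.

```python
BYTES_PER_TB = 10**12
BYTES_PER_ZB = 10**21
ZB_PER_GRAM = 1  # the article's approximate figure, rounded up (assumption)


def terabytes_in(grams: int) -> int:
    """Terabytes that fit in `grams` of DNA at the assumed density."""
    return grams * ZB_PER_GRAM * BYTES_PER_ZB // BYTES_PER_TB


def grams_needed(zettabytes: int) -> float:
    """Grams of DNA needed to hold `zettabytes` at the assumed density."""
    return zettabytes / ZB_PER_GRAM


print(f"{terabytes_in(44):,} TB in 44 g")            # 44,000,000,000 TB in 44 g
print(f"{grams_needed(175):.0f} g for 175 ZB")       # 175 g for 175 ZB
```

At this assumed density, the 175 zettabytes predicted for 2025 would weigh about 175 grams.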
TL;DR
- Because DNA has been storing genetic information for far longer than humans have existed, it’s a great storage medium for digital data.
- Pictures, movies, books, music videos, and documents can be stored in DNA.
- Challenges to DNA storage are high cost and read-write latency.
- DNA is energy-efficient, stable, compact, and durable.
On November 12, 2020, the biotech companies Twist Bioscience and Illumina and the data storage company Western Digital formed an alliance with Microsoft to advance DNA data storage. Its aims are to develop a cost-effective commercial archival storage technology and to meet the demands of the growth in global digital data. Read their entire statement here.
Research into the manipulation and functionality of DNA is also likely to advance both medicine and biology.
Follow Bioeconomy.XYZ to learn more about all the ways biotech is shaping the world around us.