What We Are Going to Do With All That Data

A 3 billion-year-old answer to a very modern problem

Published in

The Startup

5 min read · Jan 14, 2020

Lots of data generated every night. The center of the Milky Way galaxy imaged by NASA’s Spitzer Space Telescope. Image credit: NASA/Ames/JPL-Caltech. Public domain.

It’s no secret that we are generating far more information than we can possibly store: more than 2.5 quintillion bytes per day. When you click an aging link and get that “404 file not found” message, you have likely tried to access old information that was vaporized to make room for something newer. And “aging” may mean just a few months old. Even with the insanely low cost of modern data storage, we can’t keep everything. When you think about storing all this information on the scale of a decade or a century, the problem is staggering.

Our solution to storing massive amounts of information may be a 3 billion-year-old technology: DNA.

DNA has a couple of inherent advantages over silicon. Its basic unit, the nucleotide, encodes 2 bits (it is one of four possible bases) and has dimensions of about 1 nanometer (a millionth of a millimeter). Silicon transistors, by contrast, encode only 1 bit and can’t get much smaller than 10 nanometers.
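To make the bits-per-nucleotide idea concrete: with four possible bases (A, C, G, T), each position can distinguish four states, which works out to log2(4) = 2 bits. Here is a minimal sketch of packing ordinary bytes into bases at that rate. The base-to-bits mapping and the function names `bytes_to_dna` and `dna_to_bytes` are illustrative assumptions, not a description of any real DNA storage scheme, which would also have to handle synthesis constraints and error correction.

```python
# Assumed, arbitrary mapping: index 0..3 (two bits) -> base letter.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode bytes as a base string, two bits per base (four bases per byte)."""
    out = []
    for byte in data:
        # Walk the byte in 2-bit chunks, most significant chunk first.
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Decode a base string produced by bytes_to_dna back into bytes."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            # Shift in two bits per base; index() raises on an unknown letter.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    encoded = bytes_to_dna(b"data")
    print(encoded, len(encoded))  # 4 bytes become 16 bases
    print(dna_to_bytes(encoded))
```

Each byte turns into exactly four bases, so the round trip is lossless under this toy mapping.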

DNA information density is about one exabyte (a billion GB) per cubic millimeter. That is about a billion times denser than the most advanced technologies available today. A hectare’s worth of storage tapes (20,000 m³, a very large warehouse) could be reduced to 20 cubic centimeters, a volume smaller than an iPhone.
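The warehouse-to-iPhone claim is simple arithmetic, and it is easy to check. The sketch below takes the figures quoted above at face value (a billion-fold density advantage, 20,000 m³ of tape, one exabyte per cubic millimeter of DNA); the constant and function names are made up for illustration.

```python
# Figures taken from the text; treat them as rough, order-of-magnitude values.
TAPE_VOLUME_M3 = 20_000          # warehouse full of storage tapes
DENSITY_RATIO = 1_000_000_000    # DNA assumed ~1e9 times denser than tape
EXABYTES_PER_MM3 = 1             # quoted DNA density

CM3_PER_M3 = 1_000_000           # 100 cm x 100 cm x 100 cm
MM3_PER_CM3 = 1_000              # 10 mm x 10 mm x 10 mm


def dna_volume_cm3(tape_volume_m3: float, density_ratio: float = DENSITY_RATIO) -> float:
    """Volume of DNA (in cm^3) holding the same data as the given tape volume."""
    return tape_volume_m3 * CM3_PER_M3 / density_ratio


def capacity_exabytes(volume_cm3: float) -> float:
    """Data capacity (in exabytes) of a DNA volume at the quoted density."""
    return volume_cm3 * MM3_PER_CM3 * EXABYTES_PER_MM3


if __name__ == "__main__":
    volume = dna_volume_cm3(TAPE_VOLUME_M3)
    print(volume)                     # 20.0 cubic centimeters
    print(capacity_exabytes(volume))  # 20000.0 exabytes at the quoted density
```

Dividing 20,000 m³ by a billion gives 20 cm³, which matches the figure in the text.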

Written by Drew Smith, PhD