Goodbye Silicon, Hello DNA. The Future of Data Storage?

As data storage needs grow more pressing, an unconventional technology offers hope


One night a few years ago, two biologists sat in a bar in Hamburg, discussing DNA. Ewan Birney, the associate director of the European Bioinformatics Institute, and Nick Goldman, a research scientist there, were wondering how to handle the tsunami of data flooding the institute, whose job it is to maintain databases of DNA sequences, protein structures, and other biological information that scientists turn up in their research—databases that are growing exponentially, thanks mostly to dropping costs and increased automation. The maintenance of all this data on hard drives was pressing their budget to the breaking point.

Being genomicists, they joked that DNA, which is incredibly compact, sturdy, and of course has a rather lengthy history of storing data, would be a better way to go. Joking, however, gave way to fevered napkin-scribbling, and soon, recalls Goldman, “We had to order another beer, and call for more napkins to write on.”

Three years later, the results of that bar stool inspiration have been published in Nature, in a paper in which Birney, Goldman and their collaborators report using DNA to store a complete set of Shakespeare’s sonnets, a PDF of the first paper to describe DNA’s double helix structure, a 26-second mp3 clip from Martin Luther King, Jr.’s “I Have a Dream” speech, a text file of a compression algorithm, and a JPEG photograph of the institute. You may not be storing your personal data on DNA anytime soon: the process is time-consuming and expensive, and there’s the small matter of needing a DNA sequencer to open the files. But as the costs of making and sequencing DNA continue to plunge and as computer engineering approaches the limits of just how densely information can be encoded on silicon, such biological data storage may be just what’s needed for institutes and other organizations with massive archival needs.


To encode files in DNA, Birney and Goldman started by converting text, image, or audio data into binary code. Then, in several steps using software that Goldman wrote, they converted that into a code made up of the letters A, T, G, and C, which stand for the four DNA bases. Working from that string of letters, they drew up the blueprints for thousands of pieces of DNA, each containing a snippet of a file, and sent their designs to Agilent Technologies, which manufactures custom DNA for biologists. Agilent sent back the completed DNA fragments, which amounted to just a smidge of white dust in the bottom of a plastic tube, Goldman recalls. To open the files, the team used a standard DNA sequencer, a process that took about two weeks. They then used Goldman’s software to reassemble the sequenced DNA into coherent, readable files. With the exception of two small gaps in the DNA, the sonnets, photo, speech, PDF, and text file re-emerged from the white dust almost completely unscathed. After the scientists performed a little repair work, all of the information, about 739 KB worth, was retrieved with 100% accuracy.
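For readers who want a feel for the pipeline described above (file to binary, binary to bases, bases to many short fragments, and back again), here is a minimal Python sketch. It is not Goldman’s software: the two-bits-per-base table, the fragment size, and every function name are assumptions made purely for illustration.

```python
# Hypothetical mapping of bit pairs to bases, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Turn raw bytes into a string of A/C/G/T letters, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(dna: str) -> bytes:
    """Reverse bytes_to_dna: read two bits per base, then regroup into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def split_into_fragments(dna: str, size: int) -> list[tuple[int, str]]:
    """Cut a long sequence into short pieces, each tagged with its position."""
    return [(i // size, dna[i:i + size]) for i in range(0, len(dna), size)]


def reassemble(fragments: list[tuple[int, str]]) -> str:
    """Put fragments back in order, whatever order they were read in."""
    return "".join(piece for _, piece in sorted(fragments))


message = b"Shall I compare thee to a summer's day?"
fragments = split_into_fragments(bytes_to_dna(message), size=20)
recovered = dna_to_bytes(reassemble(list(reversed(fragments))))
print(recovered == message)
```

Tagging each fragment with its position stands in for whatever indexing the real designs used; the point is only that short pieces read in arbitrary order can still be stitched back into the original file.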

The fidelity is impressive, and DNA, when kept in a cold, dry, dark place, can stay intact for thousands of years. But how long would you have to want to store something for this process to be cheaper than using archival magnetic tape, which needs to be replaced every 5 years but is still the current gold standard, thanks to its low power demands compared to hard drives or other storage technologies? Birney and Goldman calculate that if you wanted to put a file in storage today and have it last for at least 600 years, DNA would be cheaper than re-recording the data to fresh magnetic tape every half-decade or so, a process that would have to be repeated 120 times over the six-century span.


Goldman speculates that if the price of making and sequencing DNA continues to fall at current rates, commercial services that store data in DNA might spring up around 50 years from now. “You would email documents and photographs and stuff that were valuable to you and your family [to the DNA storage company], and maybe a day later or a week later, they would ship you back a little bit of DNA,” says Goldman. “You could stick it in the fridge or bury it in the garden or they would store it. And they can guarantee it will be there a hundred thousand years later.”

Birney and Goldman are not the only genomicists who have realized the data-storage potential of DNA. In September 2012, genomicists George Church, Yuan Gao, and Sriram Kosuri published a short description of a similar system in Science. The Nature team stored slightly more data, and Goldman avoided one of the sources of error in the Science paper—strings of repeated bases that DNA sequencers have trouble handling—by adjusting the way his software converts the information into A, T, G, and C. But on the whole, the ideas are similar, and represent a big step forward from earlier, smaller studies.
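One way to see how an encoding can rule out runs of repeated bases is the sketch below. It is an illustration under stated assumptions, not the conversion used in either paper: each byte is written as six base-3 digits, and each digit picks one of the three bases that differ from the base just written. The starting base and all function names are hypothetical.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # assumption: 3**6 = 729 values, enough to cover 256 byte values
START = "A"         # assumption: arbitrary "previous base" before the first one


def byte_to_trits(byte: int) -> list[int]:
    """Write one byte as six base-3 digits, most significant first."""
    trits = []
    for _ in range(TRITS_PER_BYTE):
        byte, remainder = divmod(byte, 3)
        trits.append(remainder)
    return trits[::-1]


def encode(data: bytes) -> str:
    """Encode bytes so that no base is ever followed by the same base."""
    previous, out = START, []
    for byte in data:
        for trit in byte_to_trits(byte):
            # Only the three bases that differ from the last one are allowed.
            choices = [base for base in BASES if base != previous]
            previous = choices[trit]
            out.append(previous)
    return "".join(out)


def decode(dna: str) -> bytes:
    """Recover the bytes by replaying the same choice of three bases."""
    previous, trits = START, []
    for base in dna:
        choices = [b for b in BASES if b != previous]
        trits.append(choices.index(base))
        previous = base
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


dna = encode(b"To be, or not to be")
print(all(a != b for a, b in zip(dna, dna[1:])))
print(decode(dna) == b"To be, or not to be")
```

The trade-off is density: choosing among three bases instead of four carries fewer bits per base, in exchange for sequences with no repeated neighbors at all.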


Still, Kosuri is quick to point out that this technology is in its infancy. “Both of our papers are pretty naïve and simplistic, in the way we encode information,” he says. “We’re not bringing to bear the 30 years or so of electrical engineering that have gone into making CDs. We’re biologists, not electrical engineers.” And even if the technique gets faster and cheaper, DNA has two limitations: it’s not rewritable, so you couldn’t update information without redoing the whole process, and it doesn’t allow random access, so you couldn’t read, for instance, a single Shakespeare sonnet from the 154 Birney and Goldman stored without decoding the entire file.

No matter what, the need for something new to replace our current data-storage technology is pressing. In 2011, according to the Digital Universe report, humanity had created 1.8 zettabytes of new data to store, or about 1.8 trillion gigabytes. By 2020, the number is set to have grown 50 times over. And as of this year, Moore’s Law, the observation that the number of transistors on integrated circuits doubles every two years, is expected to apply no more. Doubling is projected to occur every three years from here on out as more and more circuits compete for a fixed amount of space. Silicon may have been the workhorse of the first, golden age of computers. But it may take something even better, the very stuff that makes up life, to get us to the second.


12 comments
krbabu

Can anyone point me to the Agilent product that produced the DNA powder? Thanks in advance.

JackO'Keeffe

The use of DNA is planned to be external from the body, man-made DNA and so on. I am just wondering if anyone knows the feasibility of storing data on actual human DNA? Is it humanly possible to store data inside a person? Or would it affect our genetics?

btt1943

Using DNA to store data is certainly feasible, even if retrieval might be difficult. One would wish to use a quantum computer with DNA inside within the next decade.   (btt1943)

jobarr

This is the coolest thing.  Ever.

krsmith

I had a similar thought as @keithwms as I was reading this article. I’m not as knowledgeable about the subject but I remember encountering a chapter about this same topic while reading Biomimicry: Innovation Inspired by Nature by Janine M. Benyus. Within this chapter she had interviewed a scientist working on DNA data storage, even building entire computers from DNA. And this book was published in 2002, reinforcing @keithwms’s point that this research and technology has been in the works for quite some time.

So I am curious as to why a product which is apparently in high demand, according to the article, has been in the research phase for at least a decade. I am also interested in what factors would pull the DNA computing technology out of this phase and into the hands of the public? The article mentions the technology is not quite user friendly, which is one of the top priorities of consumers, but what else is holding it back? With the anticipation of the future in sight, can society expect this to be the next big thing in technology or is it still far from being developed into an everyday product?

This is not the only biomimetic technology that seems to not quite make it out of the idea and research phase. With immense prominence in the environmental community, biomimicry has the potential to provide solutions using nature as a blueprint. Michael Pawlyn, an architect focused on creating environmentally sustainable designs, presented his ideas about how we could learn something from nature in a Ted Talk entitled Using Nature’s Genius in Architecture. Biomimicry is capable of impacting social interests from architecture to computing even though the problem exists that only individual people and small minorities of scientists and researchers are investing in this field of study. By dispersing this knowledge and making products and resources increasingly available to the public then we will see biomimetic technology leave the idea stage and have a greater influence in our society.

keithwms

Sorry but this is *very* old news. People have been considering DNA for data storage and Turing-computer constructs for at least a decade or two. When I did a postdoc in a related field in 2001, there were already conferences and lots of papers on this subject. There was a surge of interest associated with the DNA computing assertions by Adleman et al, but bear in mind that those proposals first emerged around the mid 90s. Then there was the "traveling salesman" solution and then another explosion of interest due to the marvelous constructs assembled by Ned Seeman at NYU, the nanoparticle assembly by Chad Mirkin at NWU, etc., and there have been hundreds if not thousands of other papers on the biokleptic (as Ned would put it) theme of using DNA to encode information, and more recently a lot of that interest morphed into combinatorial arrays for genomics and work on DNA analogues like PNA (which is where I got involved).

So, I don't mean to sound negative but... welcome to a rather well established field. To say this field is in its infancy is completely absurd; people have been working for decades to wield and adapt the elegant machinery of the cell for these kinds of purposes.

imtwk

Great - use your DNA and then catch a virus or a worm - so regular flu is not enough - what a prospect!!!

DwightJones

Anyone interested in pooling their DNA? Join a species collective dedicated to harmonizing with the planet for the next 1000 summers?

AshleyLawton

@JackO'Keeffe  Yes, it would affect our DNA. Just like a cold virus inserting its DNA into our lungs hijacks our systems to create new viruses.


We could hypothetically store data within someone if the DNA was not within our own cells so there was no crossover between information. We have billions of bacteria living in our digestive system.

KvnL

@keithwms If I may ask, in what related field did you do your postdoc? I am (just) a student in biomedical science, but would it be possible to make a reading list as an introduction to the matter? (Especially with regard to Non-Silicon Non-Binary Computing - transmission and storage of information)

keithwms

P.S. please pardon all my typos! Don't see how to edit this... at least DNA has good error correction....