The modern world is facing a tsunami of data. DNA is emerging as an ultra-compact way of storing it all, and now researchers supported by Microsoft have created the first system that can automatically translate digital information into genetic code and retrieve it again.

In 2018 we created 33 zettabytes (ZB)—33 trillion gigabytes—of data, according to analysts at IDC, and they predict that by 2025 that figure will rise to 175 ZB. It’s been estimated that if we were to store all our information in flash drives, by 2040 it would require 10 to 100 times the global supply of chip-grade silicon.

DNA, on the other hand, is so compact it could shrink a data center to the size of a few dice. But for that to become practical we need a DNA-based equivalent of a hard drive that lets you upload and download data in a simple and intuitive way.

Scientists have already demonstrated their ability to store everything from text to videos in DNA, but the process still requires a lot of manual intervention. “You can’t have a bunch of people running around a data center with pipettes—it’s too prone to human error, it’s too costly and the footprint would be too large,” lead author Chris Takahashi, senior research scientist at the University of Washington (UW), said in a statement.

So the researchers designed a desktop-sized device that carries out the entire process automatically. First, software converts digital data into the four DNA bases—the letters A, T, C, and G—that make up the individual building blocks of the genetic code.

The device then adds the required chemicals to a synthesizer to build the snippet of DNA and then stores it in a special vessel. When it’s time to read the data back out again, microfluidic pumps push the sample into a sequencer, where the genetic code is read before the software converts it back into 1s and 0s.
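To make the software step concrete, here is a minimal sketch of one way bytes could be mapped to bases and back. It assumes a plain two-bits-per-base mapping (00 to A, 01 to C, 10 to G, 11 to T), which is an illustrative choice and not the encoding the researchers describe; practical schemes also add error correction and avoid sequences that are hard to synthesize or read. The function names `encode` and `decode` are hypothetical.

```python
# Illustrative only: a naive 2-bits-per-base mapping, not the UW/Microsoft scheme.
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Reverse the mapping: every four bases become one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # str.index raises ValueError on any letter outside A, C, G, T
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"hello")
    print(strand, len(strand))
    print(decode(strand))
```

Under this assumed mapping, a five-byte message becomes a 20-base strand, and decoding the strand recovers the original bytes.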

There are quite a few caveats, though. For a start, they only stored the word “hello,” which represents just five bytes of data. And in a paper describing the research in Scientific Reports, they say it took 21 hours to write the data and read it back out.

The device also costs roughly $10,000, and there’s no discussion of the cost of the precursor materials. For reference, Twist Bioscience charges between seven and nine cents per base, and you’d likely need thousands or even millions of bases to store a few megabytes of data.
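A rough back-of-the-envelope sketch shows how the per-base price scales with the amount of data. It assumes an idealized density of two bits per base with no overhead for error correction, addressing, or redundancy, so real costs would be higher; the function name `synthesis_cost_usd` and the `bits_per_base` parameter are hypothetical, and only the seven-to-nine-cent price range comes from the figures quoted above.

```python
def synthesis_cost_usd(num_bytes: int, cost_per_base_usd: float,
                       bits_per_base: float = 2.0) -> float:
    """Estimate synthesis cost, assuming an idealized bits-per-base density."""
    bases_needed = num_bytes * 8 / bits_per_base
    return bases_needed * cost_per_base_usd


if __name__ == "__main__":
    one_megabyte = 1_000_000  # bytes, decimal megabyte
    for price in (0.07, 0.09):  # quoted price range per base, in dollars
        cost = synthesis_cost_usd(one_megabyte, price)
        print(f"${price:.2f} per base: about ${cost:,.0f} per megabyte")
```

Under these assumptions, a single megabyte needs four million bases, which puts the synthesis bill in the hundreds of thousands of dollars at the quoted prices.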

But DNA technology is moving quickly. Sequencing the first human genome cost $2.7 billion and took 15 years, but just 20 years later private companies will do it for under $1,000 in a matter of weeks.

And Microsoft isn’t the only company working on DNA storage. Intel and Micron are also funding research, and last year MIT spinoff Catalog revealed they are building a machine the size of a couple of shipping containers that will be able to write a terabit of data into DNA per day sometime this year.

Their approach promises to be more cost-effective, because rather than directly converting 1s and 0s into specially-synthesized DNA strands, they will use enzymes to arrange cheaper pre-made strands into larger molecules whose patterns encode the relevant data. A lack of detail has made it tricky to assess the viability of the idea, though.

Data storage is also not the only aspect of the digital world where DNA could play a greater role. A day before the UW research was published, scientists at University of California, Davis revealed the first reprogrammable DNA computer in a paper in Nature.

It’s not the first time DNA has been used to carry out computation, but previously the DNA hardware had to be designed specifically for the task at hand. This time the researchers created hundreds of DNA strand building blocks that can be combined to create circuits that implement 21 different algorithms for simple tasks, like generating patterns or counting.

The technology isn’t going to replace silicon computers anytime soon, but could be used to carry out computation at a molecular level. That could involve directing the activities of nanoscale factories that assemble molecules or tiny DNA robots used to deliver drugs, Petr Sulc, an assistant professor at Arizona State University, told Wired.

The UW scientists are also looking at introducing these kinds of computational capabilities into their DNA systems. They have developed processing techniques that allow them to use interactions between the molecules to directly search for data like images in the DNA without converting it back into digital format.

They say their next steps will be to combine their new data storage device with these kinds of capabilities, as well as more advanced methods for mixing liquids that can move single droplets around using a grid of electrodes.


I am a freelance science and technology writer based in Bangalore, India. My main areas of interest are engineering, computing and biology, with a particular focus on the intersections between the three.
