The idea of storing digital data in DNA seems like science fiction. At first glance, it is not obvious that a molecule can store data. The term “data storage” conjures up images of physical artifacts like CDs and data centers, not a microscopic molecule like DNA. But there are a number of reasons why DNA is an exciting option for information storage.
The status quo
We’re in the midst of a data explosion. We create vast amounts of information via an estimated 17 billion internet-connected devices: smartphones, cars, health trackers, and more. As we continue to add sensors and network connectivity to physical devices, and as the 4.2 billion people who are currently offline come online, we will produce more and more data.
Oftentimes we also want to store data for longer-term purposes. These timeframes might exceed the capabilities of current storage technology. For example, we might want to store family photos and videos such that our descendants 100 years from now might be able to view and interact with them. We might want to pass down cultural relics, family recipes, or technical know-how to future generations.
Our current data storage methods are struggling to keep pace with our demand for storage capabilities. A data storage crisis would be incredibly stifling for human development. So, we need new robust and sustainable solutions for both short-term and long-term data storage.
A panel at SynBioBeta SF 2016 with representatives from Intel, Gen9, and Semiconductor Research Corporation discussed the current state of DNA data storage. One major takeaway was that long-term storage is the fastest-growing segment of the data storage market. Moreover, all panelists seemed to agree that demand for DNA data storage will be driven primarily by the need for a solution that transcends the limits of silicon-based storage systems.
Why would we store digital data in DNA?
DNA is nature’s information medium. In fact, we call DNA the “blueprint of life” precisely because it contains the recipes that guide cells in making proteins. These proteins enable all aspects of life, from digestion to movement and from growth to fighting diseases.
So, DNA already encodes information: “biological recipes,” if you will. The idea behind DNA data storage is to repurpose that capacity so that we can store our digital data (our selfies, movies, and documents) in DNA. To do this, we must first translate digital information, the ones and zeros of a file, into biological information, a sequence of the four DNA bases A, C, G, and T.
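To make the translation step concrete, here is a minimal sketch in Python. It assumes the simplest possible scheme, in which every pair of bits maps to one base; the mapping table and the function names `encode` and `decode` are illustrative assumptions, not the method used by any group mentioned in this article. Practical schemes also add redundancy and error correction, which this sketch leaves out.

```python
# Hypothetical mapping for illustration: each pair of bits becomes one base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a string of DNA bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate a string of DNA bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)                   # CAGACGGC
print(decode(strand))           # b'Hi'
```

Under this assumed mapping, each byte becomes four bases, so the two-byte message "Hi" becomes an eight-base sequence.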
The major reasons for using DNA are:
1. Eternal relevance: As long as there is DNA-based life on Earth, DNA will be relevant. Conventional methods of data storage will always be superseded by new technology, so if we use conventional data storage, we will always need to transfer data to the new, better systems.
2. Stability: DNA appears able to withstand a fair degree of environmental stress. In 2013, scientists sequenced DNA recovered from a 700,000-year-old horse fossil. This suggests that a DNA-based storage system could outlast hard disks and tapes.
3. High storage capacity: The storage potential for DNA vastly exceeds that of all other media. Some experts estimate that all the world’s data could be stored in one kilogram of DNA — an incredible proposition.
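A back-of-the-envelope calculation shows why the capacity estimates are so large. The sketch below assumes two bits per nucleotide with no redundancy and an average mass of about 330 grams per mole for a nucleotide in single-stranded DNA; both figures are assumptions chosen for illustration, and the result is a theoretical ceiling rather than what any real system achieves.

```python
AVOGADRO = 6.022e23                  # molecules per mole
NUCLEOTIDE_MASS_G_PER_MOL = 330.0    # assumed average, single-stranded DNA
BITS_PER_NUCLEOTIDE = 2              # four possible bases = log2(4) bits


def theoretical_bytes_per_gram() -> float:
    """Upper bound on bytes stored per gram, ignoring all overhead."""
    nucleotides_per_gram = AVOGADRO / NUCLEOTIDE_MASS_G_PER_MOL
    return nucleotides_per_gram * BITS_PER_NUCLEOTIDE / 8


bytes_per_gram = theoretical_bytes_per_gram()
print(f"{bytes_per_gram / 1e18:.0f} exabytes per gram")
print(f"{bytes_per_gram * 1000 / 1e21:.0f} zettabytes per kilogram")
```

Under these assumptions the ceiling comes out to roughly 456 exabytes per gram, or several hundred zettabytes per kilogram. Addressing, redundancy, and error correction would all reduce the usable figure.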
Where are we today?
Earlier this year researchers at Microsoft and the University of Washington broke the record for storing digital data in DNA. They managed to store and retrieve 200 megabytes of information (including high-definition video, multiple books and articles as well as a database) using DNA provided by Twist Bioscience.
Storing 200 MB represents a roughly 270-fold leap from the previous record of 0.74 MB, achieved in 2013. This is great progress, and it reflects the growing interest in the field. However, the current cost of DNA data storage is not attractive.
Storing digital data in DNA involves both reading and writing DNA. While the price of reading DNA (DNA sequencing) has fallen sharply, the price of writing DNA (DNA synthesis) remains prohibitively high for data storage. New companies like Gen9 and Twist Bioscience have emerged with new methods that allow for cheaper, faster DNA synthesis.
However, greater cost reductions in synthesis are still needed to accelerate DNA data storage.
What needs to be done?
To make DNA data storage a commercial reality, we need to:
- Develop better ways of translating digital information into biological information, ways that enable fast, accurate, and cost-efficient retrieval of that information.
- Invent and advance new chemistries to enable cheap DNA synthesis.
- Incorporate more automation in production workflows to achieve cost reductions.
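As one illustration of what a retrieval-friendly translation scheme has to handle: synthesized DNA strands are short, so a file must be split across many strands that sit unordered in a pool. The sketch below assumes a simple approach in which each strand carries a two-byte index ahead of its payload. The helper names (`to_bases`, `from_bases`, `split_into_strands`, `reassemble`), the two-bits-per-base mapping, and the chunk size are all hypothetical choices made for illustration, not a description of any existing system.

```python
import random

BASES = "ACGT"  # hypothetical mapping: A=00, C=01, G=10, T=11


def to_bases(data: bytes) -> str:
    """Encode bytes as bases, two bits per base, most significant bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def from_bases(strand: str) -> bytes:
    """Decode a strand whose length is a multiple of four back into bytes."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def split_into_strands(data: bytes, payload_size: int = 8) -> list[str]:
    """Split data into short strands, each prefixed with a 2-byte index."""
    strands = []
    for index, start in enumerate(range(0, len(data), payload_size)):
        # A 2-byte index limits this sketch to 65,536 strands.
        chunk = index.to_bytes(2, "big") + data[start:start + payload_size]
        strands.append(to_bases(chunk))
    return strands


def reassemble(strands: list[str]) -> bytes:
    """Recover the original data from strands in any order."""
    chunks = sorted(
        (from_bases(s) for s in strands),
        key=lambda chunk: int.from_bytes(chunk[:2], "big"),
    )
    return b"".join(chunk[2:] for chunk in chunks)


message = b"Store this message in a pool of short strands."
pool = split_into_strands(message)
random.shuffle(pool)            # strands in a physical pool have no order
assert reassemble(pool) == message
```

The index is what makes retrieval possible from an unordered pool. A practical design would also need to tolerate strands that are missing or misread, which this sketch does not attempt.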
Open questions
Because this field is very young, there are several open questions that have yet to be answered:
1. How do we design for security? Today, very little will stop a skilled, dedicated, and patient hacker from accessing and stealing confidential information. If we are going to design a new data storage system, it should be more secure than the current paradigm. We need to think seriously about designing for security from the outset.
2. What will the user interface look like? The user interface of a new technology often influences whether or not that technology will be adopted en masse. How we will interact with DNA data storage technology remains an open question. In the future, will we all have DNA sequencers, DNA synthesizers, and algorithms that translate digital data into biological data in our phones, our homes, or our local community biohacker spaces? Or will these capabilities be restricted to companies? In either scenario, how easily we can interact with DNA data storage technology might affect how quickly we adopt it.
3. How will the world receive this? Today, there are pressing debates about consumer privacy and biotechnology. In the wake of the Snowden revelations, many worry that their data is accessible without their permission. In addition, many are generally apathetic towards biotechnology. Perhaps there is an opportunity to create a world in which consumers can store some of their own data in DNA instead of relying on centralized data centers.
While some will welcome the transition from magnetic storage to DNA data storage, others are likely to be uneasy with it, citing their distrust of biotechnology. Considering that many are unaware of the processes that currently store their information, should future consumers even be told that their information is stored in synthetic DNA? Or will consumers be indifferent about the storage medium?
People’s answers to these questions are likely to vary with their location in the world.
4. What kind of information do we want to store using DNA? Will it be archival data that we access infrequently, like messages we want to pass to future generations, or frequently accessed data like our selfies and Netflix movies?
The DNA-for-data-storage scene is quite nascent. Earlier this year, Helixworks announced a DNA data storage system that can be bought off Amazon. Their system can store up to 512 kB, which according to Helixworks is enough to “store a small photograph, a poem, a love-letter, a eulogy or a bitcoin wallet.” A number of groups have recently formed to attempt new solutions, including Edinburgh’s 2016 undergraduate iGEM team and Catalog, a new entrant in IndieBio’s fourth cohort of biotech startups in San Francisco.
As these groups continue to develop their technologies, we will start to get a clearer picture of their implementation strategies. Until then, it is too early to make predictions with certainty.