As the modern world produces ever more data, researchers are scrambling to find new ways to store it all. DNA holds promise as an extremely compact and stable storage medium, and now a new approach could let us write digital data directly into the genomes of living cells.
Efforts to repurpose nature’s built-in memory technology aren’t new, but in the last decade the approach has gained renewed interest and seen some major progress. That’s been driven by an explosion of data that shows no signs of slowing down. By 2025, it’s estimated that 463 exabytes will be created each day globally.
Storing all this data could quickly become impractical using conventional silicon technology, but DNA could hold the answer. For a start, its information density is millions of times better than conventional hard drives, with a single gram of DNA able to store up to 215 million gigabytes.
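To put those two figures side by side, here is a rough back-of-the-envelope calculation. It assumes decimal units (one exabyte equals a billion gigabytes) and takes the quoted density as a theoretical ceiling, ignoring the redundancy and indexing overhead any real storage scheme would need. The function and constant names are purely illustrative.

```python
# Figures quoted in the article; decimal (SI) units are an assumption.
GB_PER_GRAM = 215e6   # "up to 215 million gigabytes" per gram of DNA
EB_PER_DAY = 463      # estimated data created globally per day by 2025
GB_PER_EB = 1e9       # 1 exabyte = 10^9 gigabytes in SI units


def grams_of_dna_per_day(eb_per_day=EB_PER_DAY, gb_per_gram=GB_PER_GRAM):
    """Grams of DNA needed to hold one day's data at the quoted maximum density."""
    return eb_per_day * GB_PER_EB / gb_per_gram


grams = grams_of_dna_per_day()
print(f"About {grams:.0f} grams of DNA per day")
```

Under these assumptions, a full day of the world's data would fit in roughly two kilograms of DNA.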
It’s also highly stable if properly stored. In 2013, researchers were able to extract the full genome of an extinct horse species from 700,000 years ago. Learning to store and manipulate data using the same language as nature could also open the door to a host of new capabilities in biotechnology.
The main complication lies in finding a way to interface the digital world of computers and data with the biochemical world of genetics. At present this relies on synthesizing DNA in the lab, and while costs are falling rapidly, this is still a complicated and expensive business. Once synthesized, the sequences have to be carefully stored in vitro until they’re ready to be accessed again, or they can be spliced into living cells using CRISPR gene-editing technology.
Now though, researchers from Columbia University have demonstrated a new approach that can directly convert digital electronic signals into genetic data stored in the genomes of living cells. That could lead to a host of applications both for data storage and beyond, says Harris Wang, who led the research published in Nature Chemical Biology.
“Imagine having cellular hard-drives that can compute and physically reconfigure in real time,” he wrote in an email to Singularity Hub. “We feel that the first step is to be able to directly encode binary data into cells, without having to do in vitro DNA synthesis.
“This is perhaps the hardest part of all DNA storage approaches. If you can get the cells to directly talk to a computer, and interface its DNA-based memory system with a silicon-based memory system, then there are lots of possibilities in the future.”
The work builds on a CRISPR-based cellular recorder Wang had previously designed for E. coli bacteria, which detects the presence of certain DNA sequences inside the cell and records this signal into the organism’s genome.
The system includes a DNA-based “sensing module” that produces elevated levels of a “trigger sequence” in response to specific biological signals. These sequences are incorporated into the recorder’s “DNA ticker tape” to document the signal.
In this new work, Wang and colleagues adapted the sensing module to work with a biosensor developed by another team that reacts to electrical signals. Large populations of the bacteria were then placed in a device made up of a series of chambers that enabled the team to expose them to electrical signals.
When they applied a voltage, levels of the trigger sequence were elevated and recorded into the DNA ticker tape. Stretches with high proportions of trigger sequence were used to represent a binary “1” and their absence a “0,” allowing the researchers to directly encode digital information into the bacteria’s genome.
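The read-out idea can be sketched as a simple threshold rule. This is a toy illustration, not the team's method: the function name, the 0.5 cutoff, and the example fractions are all hypothetical, and the researchers used a purpose-built classifier rather than a fixed threshold.

```python
def call_bits(trigger_fractions, threshold=0.5):
    """Turn per-stretch trigger-sequence fractions into binary digits.

    trigger_fractions: for each stretch of the ticker tape, the share of
    recorded entries that are trigger sequence (values from 0.0 to 1.0).
    threshold: hypothetical cutoff; at or above it the stretch reads as 1.
    """
    return [1 if fraction >= threshold else 0 for fraction in trigger_fractions]


# Made-up fractions for three stretches: voltage on, off, on.
bits = call_bits([0.8, 0.1, 0.6])
print(bits)
```

A real read-out has to cope with noisy, overlapping signals, which is presumably why a trained classifier is a better fit than a single cutoff.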
The amount of data that a single cell can hold is pretty small, just three bits. So the researchers devised a way to encode 24 separate populations of bacteria with different 3-bit chunks of data simultaneously for a total of 72 bits. They used this to encode the message “hello world!” into the bacteria, and showed that by sequencing the combined population and using a specially designed classifier, they could retrieve the message with 98 percent accuracy.
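The arithmetic works out neatly: 12 characters at 6 bits each is 72 bits, or 24 chunks of 3 bits. The sketch below shows one way such a split could work. The 6-bit alphabet is an assumption made for illustration; the article does not describe the character encoding the team actually used.

```python
# Hypothetical 6-bit alphabet (room for up to 64 symbols); the actual
# character encoding used in the study is not given in the article.
ALPHABET = "abcdefghijklmnopqrstuvwxyz !"
BITS_PER_CHAR = 6
BITS_PER_POPULATION = 3  # each bacterial population holds three bits


def encode(message):
    """Convert a message into a list of 3-bit strings, one per population."""
    bits = "".join(format(ALPHABET.index(ch), f"0{BITS_PER_CHAR}b") for ch in message)
    return [bits[i:i + BITS_PER_POPULATION] for i in range(0, len(bits), BITS_PER_POPULATION)]


def decode(chunks):
    """Rejoin the 3-bit chunks and map each 6-bit group back to a character."""
    bits = "".join(chunks)
    return "".join(
        ALPHABET[int(bits[i:i + BITS_PER_CHAR], 2)]
        for i in range(0, len(bits), BITS_PER_CHAR)
    )


chunks = encode("hello world!")
print(len(chunks), "populations,", len(chunks) * BITS_PER_POPULATION, "bits")
print(decode(chunks))
```

Because every chunk lives in a different population, all 24 can be written at the same time, and the message is reassembled only when the populations are read back in order.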
Obviously 72 bits is a long way off the storage capacity of modern hard drives, and even cell-free DNA storage techniques now deal in gigabytes. But Wang says this is just a proof of concept, and there is plenty of scope for boosting the efficiency of the CRISPR machinery that powers the recorder, the length of the ticker tape that can be reliably read, and even the electronics used to encode the data.
“All of these things are going to improve over the next few years and I definitely think it is possible to massively scale up the capacity of the system by several orders of magnitude even in the short term,” he said.
And storing data in cells rather than in vitro has a number of significant benefits, he added. For a start, it’s much cheaper to amplify or duplicate the data because you can simply grow more cells rather than having to carry out complex artificial DNA synthesis. In the paper the team showed that the recorded information remained stable for between 60 and 80 generations of cells.
Cells also already have a native capacity to keep DNA safe from environmental disturbances. They demonstrated this by adding the E. coli cells to unsterilized potting soil and then reliably retrieving a 52-bit message by sequencing the combined soil microbial community.
Perhaps most exciting, though, is the possibility of coupling this data recording ability to emerging research on biocomputers. Researchers have already started to engineer cells’ DNA to allow them to carry out logic and memory operations, but creating a direct interface between silicon and genomes could significantly accelerate our ability to reprogram cells for our own purposes.