A Living Hard Drive: This GIF Was Stored in the DNA of Bacteria

DNA is a hugely promising medium for storing data. Consider that a cell nucleus can hold the instructions for an organism as complex as a human. So far efforts to store non-genetic data in DNA have been carried out in test tubes, but now scientists have encoded a GIF into the genome of living bacteria.

The scientists, from Harvard University, used the CRISPR genome-editing tool to store two things in the genome of E. coli bacteria: a picture of a hand and an animation of a running horse adapted from Eadweard Muybridge’s 1878 photographic study Human and Animal Locomotion.

More importantly, they were able to retrieve the image of the hand perfectly and the GIF with 90 percent accuracy by sequencing the bacterial genomes. Their results were published in the journal Nature on Wednesday.

Image Credit: Wyss Institute at Harvard University

Efforts to store unconventional data in DNA have been going on for years thanks to DNA’s incredible compactness and long shelf life. Properly stored, it can keep data intact for at least 100,000 years. Just a couple of months ago Microsoft said it planned to incorporate a DNA storage system in one of its data centers by the end of the decade.

Typically, though, this has been done by translating the bits that encode books, images or audio recordings into DNA sequences and then synthesizing them artificially. By using CRISPR instead, the Harvard team, led by renowned geneticist George Church, was able to hijack the genomes of E. coli bacteria to store the information.
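To make the "bits into DNA sequences" step concrete, here is a minimal sketch of one way such a translation could work. The two-bits-per-base mapping and the function names are assumptions chosen for illustration; they are not the encoding used by the Harvard team or by any particular storage system, which typically add error correction and avoid sequences that are hard to synthesize.

```python
# Hypothetical mapping: each base carries two bits (A=00, C=01, G=10, T=11).
# This is an illustrative assumption, not the scheme from the study.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Translate each byte into four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna; expects a length that is a multiple of four."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            # Shift the bits collected so far and append this base's two bits.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = bytes_to_dna(b"GIF")
    print(strand)
    print(dna_to_bytes(strand))
```

In the synthesis route, a strand like the one printed here would then be manufactured chemically and later read back with a sequencer.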

The CRISPR system is actually a natural defense mechanism that bacteria use to develop immunity to invading viruses by recording snippets of the attacker’s DNA in the bacteria’s genome. These snippets are then used to guide the enzyme Cas9 to find and destroy invasive DNA next time the virus attacks.

Scientists have repurposed the CRISPR/Cas9 system to edit genomes by re-engineering it to cut DNA at a specific location. This allows them to remove existing genes or add new ones.

In this new study, though, the researchers instead re-purposed the lesser-known Cas1 and Cas2 proteins responsible for inserting viral DNA into the bacteria’s genome. “We found that if we made the sequences we supplied look like what the system usually grabs from viruses, it would take what we give,” Seth Shipman, a neuroscience researcher at Harvard and study co-author, told The Verge.

Importantly, Cas1 and Cas2 insert new pieces of DNA in the order they arrive, which is what made it possible for the researchers to encode an animation. Because the process is imprecise, the data was encoded in 600,000 cells to boost accuracy, but modern sequencing tools make it fairly quick to retrieve.
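Why do many copies boost accuracy? One simple way to see it is a majority vote: if most cells hold the correct base at a given position, occasional errors in individual cells get outvoted. The sketch below is a simplified illustration under that assumption, with made-up reads and a hypothetical `consensus` function. It is not the reconstruction method the researchers used.

```python
from collections import Counter


def consensus(reads: list[str]) -> str:
    """Pick the most common base at each position across equal-length reads."""
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must have equal length")
    # zip(*reads) yields one column of bases per sequence position.
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*reads)
    )


if __name__ == "__main__":
    # Three made-up reads of the same stored sequence, two with one error each.
    noisy_reads = ["ACGT", "ACGA", "TCGT"]
    print(consensus(noisy_reads))
```

With only three reads a single error per position can be tolerated; spreading the data across hundreds of thousands of cells gives far more room for mistakes.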

The amount of data stored in the cells is considerably less than what’s been achieved with the synthesis route. Last year researchers from Microsoft and the University of Washington stored 200 megabytes of data in a smear of DNA smaller than a pencil tip.

That means the approach is unlikely to supplant synthesized DNA for the kind of long-term data storage that has piqued the interest of IT firms. But the ability to record data directly into a cell’s genome does open up a host of new potential applications.

The one the researchers themselves are most interested in is the prospect of turning cells into recording devices that can track changes in both their internal workings and their environment over time. They think this could help us to understand the developmental processes that govern how neurons morph into specialized cells over time or help track which neurons are talking to each other.

Further into the future, it may be possible to effectively create “black boxes” for cells in the human body, Church told The New York Times. Bacteria could be made to record the activity of cells over time, and when someone gets ill doctors could extract the bacteria and sequence their DNA to play it back.

It’s also possible to imagine the approach becoming a useful new tool for synthetic biologists, who are already using gene circuits to build tiny computers inside cells that can carry out logic functions. Recording data in the genome would give those circuits a form of memory.

Banner Image Credit: Eadweard Muybridge/Wikimedia Commons

Edd Gent
http://www.eddgent.com/
I am a freelance science and technology writer based in Bangalore, India. My main areas of interest are engineering, computing and biology, with a particular focus on the intersections between the three.