A Novel Method for Storing Digital Data, Inspired by Nature

A team of Technion researchers is finding a way to encode information in DNA, taking cues from nature to transform data storage
In today’s digitally driven world, as we type, snap, record, and save, we accumulate a veritable avalanche of data. This data is currently stored in buildings the size of football fields that consume vast quantities of energy. Is there a better, more efficient way?
A team of Technion researchers, led by Prof. Eitan Yaakobi in the Henry and Marilyn Taub Faculty of Computer Science, says yes. By encoding data in microscopic strands of DNA, Yaakobi’s team is leveraging nature to significantly reduce the space and energy needed to archive our digital lives.
“Nature uses DNA to store information, so we wondered, ‘Why not do the same?’” he says.
This work builds on decades of prior research. In the late 1950s, the well-known American physicist Richard Feynman first proposed the idea of storing information in DNA. Then, in the 2010s, research teams from Harvard and the European Bioinformatics Institute independently conducted the first large-scale proof-of-concept experiments, inspiring the contemporary wave of research that Prof. Yaakobi is riding.
Prof. Yaakobi’s method works by translating the language of computers, 0s and 1s or “bits,” into the language of DNA: molecules known as nucleotides. The four nucleotides that make up DNA are adenine (A), thymine (T), guanine (G), and cytosine (C). The nucleotides couple to form base pairs, which in turn are strung together like rungs on a ladder to form DNA’s double helix structure, which resembles a winding staircase.
Like an alphabet, nucleotides convey meaning through the order in which they are arranged. In Prof. Yaakobi’s storage method, each nucleotide corresponds to a different combination of 0s and 1s. When strung together, the nucleotides encode a full series of bits. These series of bits, in turn, encode digital data such as text files, audio files, photos, or videos.
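To make the idea concrete, here is a minimal sketch in Python of how bits might be mapped to nucleotides. The specific assignment used here (00 to A, 01 to C, 10 to G, 11 to T) and the function name `encode_bytes_to_dna` are assumptions chosen for illustration; the article does not say which mapping Prof. Yaakobi's team uses, and practical schemes typically add further constraints and redundancy.

```python
# Hypothetical mapping: two bits per nucleotide. The real assignment used by
# the researchers is not given in the article.
BITS_TO_NUCLEOTIDE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def encode_bytes_to_dna(data: bytes) -> str:
    """Translate raw bytes into a string of nucleotide letters."""
    # Write every byte as eight binary digits, e.g. b"H" -> "01001000".
    bits = "".join(f"{byte:08b}" for byte in data)
    # Read the bit string two digits at a time and look up each pair.
    return "".join(
        BITS_TO_NUCLEOTIDE[bits[i:i + 2]] for i in range(0, len(bits), 2)
    )


if __name__ == "__main__":
    print(encode_bytes_to_dna(b"Hi"))  # CAGACGGC
```

With this mapping, each byte of a file becomes four nucleotides, so the two-letter text "Hi" becomes an eight-letter strand.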
Prof. Yaakobi’s method relies on the creation of actual DNA strands using a machine known as a synthesizer, which sews together base pairs to form DNA. The DNA is then stored, spaghetti-like, in a tiny container.
On the other side of the process, the information encoded in the DNA is retrieved through a process known as DNA sequencing, which reverses the steps carried out by the synthesizer. The DNA is fed into a device known as a sequencer, which reads the DNA like a book and outputs a long series of letters to a computer: A, T, C, and G. The computer then translates this series of letters into a series of 0s and 1s.
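The final translation step, from letters back to bits, can be sketched in a few lines of Python. As before, the two-bits-per-letter table and the function name `decode_dna_to_bytes` are illustrative assumptions rather than the team's actual scheme, and the sketch assumes the sequencer's output is error-free.

```python
# Hypothetical mapping: each nucleotide letter stands for two bits.
NUCLEOTIDE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def decode_dna_to_bytes(strand: str) -> bytes:
    """Translate a string of nucleotide letters back into raw bytes."""
    # Four letters carry eight bits, so a whole number of bytes needs a
    # strand length that is a multiple of four.
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    bits = "".join(NUCLEOTIDE_TO_BITS[letter] for letter in strand)
    # Regroup the bit string into bytes, eight digits at a time.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    print(decode_dna_to_bytes("CAGACGGC"))  # b'Hi'
```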
This method of storing data is superior to existing methods in that it requires significantly less space and energy, thus preserving natural resources and reducing harm to the environment.
“The information contained in a million thumb drives can be captured in DNA the size of one thumb drive,” Prof. Yaakobi explains. “To put it another way, the DNA contained in one human body could theoretically store more bits than exist in the entire universe.”
Realizing the promise of DNA data storage hinges on researchers’ ability to refine the method. The DNA synthesis process is extremely expensive, and both the synthesis and retrieval processes are very slow. What’s more, both processes are prone to errors, with the equivalent of typos occurring when writing and subsequently reading the DNA.
Prof. Yaakobi’s research focuses on correcting these errors using a mechanism known as DNAformer. The system works by comparing multiple copies of a given DNA sequence to identify and correct errors. It pieces together the “truth” by finding commonalities across the copies, in the same way a criminal investigator arrives at the truth by interviewing multiple witnesses. Prof. Yaakobi explains that future research efforts will focus not only on correcting errors, but also on reducing the cost of DNA synthesis.
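The "multiple witnesses" idea can be illustrated with a simple majority vote. The Python sketch below is not DNAformer itself, whose internals the article does not describe; it is a much simpler stand-in that assumes every copy has the same length and that errors only swap one letter for another. Real sequencing reads can also gain or lose letters, which this sketch does not handle. The function name `majority_consensus` and the sample reads are made up for illustration.

```python
from collections import Counter


def majority_consensus(reads: list[str]) -> str:
    """Return the most common letter at each position across noisy copies."""
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("this sketch assumes all reads have equal length")
    # zip(*reads) walks the copies position by position; at each position
    # the letter reported by the most copies wins.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


if __name__ == "__main__":
    # Five invented copies of one strand; three contain a single "typo".
    reads = ["CAGACGGC", "CAGTCGGC", "CAGACGGC", "CTGACGGC", "CAGACGGA"]
    print(majority_consensus(reads))  # CAGACGGC
```

Because each typo appears in only one copy, the other copies outvote it at every position and the original strand is recovered.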
The method’s real-world applications, as Yaakobi explains, are vast. Most directly, the DNA storage method would transform how we store digital data, rendering large, energy-hungry data centers irrelevant. Refining the data retrieval process would, moreover, transform our ability to sequence biological DNA, with implications for genetic engineering, medicine, and criminal justice. Finally, we could use DNA for authentication purposes: imagine, for instance, tagging a luxury car with unique DNA. While a criminal could easily replace a license plate and erase a VIN, he would have a much harder time eliminating microscopic strands of DNA scattered throughout the cabin.
For Yaakobi, the Technion is the perfect place to move this research forward.
“I can’t imagine a better place to conduct this research than the Technion, which is home to outstanding faculty members and the brightest students in the world,” he says. “At the Technion, we can collaborate with leading researchers in biology, chemistry, bioinformatics, and mechanical engineering, enabling us to advance this research much more quickly than would otherwise be possible.”