For decades, researchers have worked toward scalable data storage in DNA’s four nucleotides (A, T, G, and C). Once mastered, the technology could store data at densities millions of times greater than conventional media, but poor reliability and slow read/write speeds have so far stymied a true paradigm shift in archival storage. Now, a team led by researchers from the Georgia Tech Research Institute (GTRI) has advanced DNA storage with a microchip that can quickly and cheaply grow DNA strands for high-density storage.
The microchips – which currently exist as proofs of concept – are about one square inch and include ten banks of “wells,” each a few hundred nanometers deep, that house the DNA strands as they grow. “Working with our colleagues at Twist and in Georgia Tech’s Institute for Electronics and Nanotechnology, we have optimized the geometry of the microwells to fit more and more of them on a chip,” explained Nicholas Guise, a senior research scientist at GTRI and project director for the Scalable Molecular Archival Software and Hardware (SMASH) project, in an interview with GTRI’s John Toon.
The final version of the microchips, the researchers say, will include a second layer of electronics to manage the chemical process. After the strands are completed, they will be harvested and dried.
“We’ve been able to show that it’s possible to grow DNA to the sort of length that we want, and at about the feature size that we care about using these chips,” Guise said. “The goal is to grow millions of unique, independent sequences across the chip from these microwells, with each serving as a tiny electrochemical bioreactor.”
To compensate for DNA storage’s high error rate, the researchers also designed a codec to identify and correct errors. “We’ve targeted this codec to be super robust against errors, able to work with devices that read as much as 10% of the bases wrong,” said Adam Meier, a senior scientist with GTRI and SMASH. “What we expect is that eventually the error correction code will be more lightweight. It will eventually have less of an impact on the final design, and when the error rates are better, then the codec will become less important. That’s part of our research into future phases of the program.”
“What this does operationally is allow us to potentially turn up the speed and throughput of the synthesizer and sequencer,” Guise added. “If you can tolerate some of the error through a resilient codec, you can write much more data and read much more data faster.”
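A simple way to see the trade-off Guise describes is a repetition code over the four bases. The sketch below is a hypothetical illustration only, not the SMASH codec: it writes each pair of bits as five copies of a single nucleotide and decodes by majority vote, so scattered single-base misreads, even at rates well above 10%, can still be corrected at the cost of raw storage density.

```python
# Hypothetical illustration (not the SMASH codec): a 5x repetition code
# over DNA bases, decoded by majority vote. Real DNA codecs are far more
# efficient, but the redundancy-vs-error-tolerance trade-off is the same.

BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {v: k for k, v in BASE_FOR_BITS.items()}
REPEAT = 5  # each base is synthesized five times

def encode(bits: str) -> str:
    """Map each bit pair to a base, repeating it REPEAT times."""
    pairs = [bits[i:i + 2] for i in range(0, len(bits), 2)]
    return "".join(BASE_FOR_BITS[p] * REPEAT for p in pairs)

def decode(strand: str) -> str:
    """Majority-vote each block of REPEAT bases back to a bit pair."""
    out = []
    for i in range(0, len(strand), REPEAT):
        block = strand[i:i + REPEAT]
        out.append(BITS_FOR_BASE[max(set(block), key=block.count)])
    return "".join(out)

def flip(base: str) -> str:
    """Simulate a sequencer misread: replace a base with a wrong one."""
    return {"A": "C", "C": "G", "G": "T", "T": "A"}[base]

message = "0110100111001010"
strand = encode(message)
# Misread one base in every 5-base block (a 20% error rate):
noisy = "".join(flip(b) if i % REPEAT == 0 else b
                for i, b in enumerate(strand))
print(decode(noisy) == message)  # majority vote still recovers the data
```

A repetition code is deliberately crude: it spends 5x the bases to survive errors. More lightweight codecs, like the one Meier describes, get the same resilience with far less overhead, which is what lets the synthesizer and sequencer run faster without losing data.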
To learn more about this research, read the full reporting from GTRI’s John Toon.