Later this month IARPA will hold a Proposers’ Day to kick off its planned four-year Molecular Information Storage (MIST) project. “Today’s exabyte-scale data centers occupy large warehouses, consume megawatts of power, and cost billions of dollars to build, operate and maintain over their lifetimes. This resource intensive model does not offer a tractable path to scaling beyond the exabyte regime in the future,” says IARPA.
The search for new storage technologies is hardly new. In recent years, the proliferation of data-generating devices (scientific instruments and commercial IoT) and the rise of AI and data analytics capabilities that make use of vast datasets have increased the pressure to find alternative approaches to storage.
The MIST program is expected to last four years and be composed of two 24-month phases. “The desired capabilities” for both phases of the program are described by three Technical Areas (TAs):
- TA1 (Storage). Develop a table-top device capable of writing information to molecular media with a target throughput and resource utilization budget. Multiple, diverse approaches are anticipated, which may utilize DNA, polypeptides, synthetic polymers, or other sequence-controlled polymer media.
- TA2 (Retrieval). Develop a table-top device capable of randomly accessing information from molecular media with a target throughput and resource utilization budget. Multiple, diverse approaches are anticipated, which may utilize optical sequencing methods, nanopores, mass spectrometry, or other methods for sequencing polymers in a high-throughput manner.
- TA3 (Operating System). Develop an operating system for use with storage and retrieval devices that coordinates addressing, data compression, encoding, error-correction and decoding of files from molecular media in a manner that supports efficient random access at scale. Multiple, diverse approaches are anticipated, which may draw on established methods from the storage industry, or develop new methods to accommodate constraints imposed by polymer media.
The end result of the program will be technologies that jointly support end-to-end storage and retrieval at the terabyte scale, and which present a clear and commercially viable path to future deployment at the exabyte scale. Collaborative efforts and teaming among potential performers are highly encouraged.
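To make the addressing, encoding and random-access ideas in the three Technical Areas concrete, here is a minimal Python sketch. It is not IARPA's design or any performer's method: the two-bits-per-nucleotide mapping, the 2-byte address prefix, and the function names (`bytes_to_dna`, `dna_to_bytes`, `make_strands`, `read_chunk`) are all assumptions chosen for illustration. Compression and error correction, which TA3 also calls for, are left out.

```python
# Illustrative only: a toy mapping of 2 bits per nucleotide.
# Practical schemes add error correction and avoid problematic sequences.
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into bytes."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        value = 0
        for base in strand[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def make_strands(data: bytes, payload_bytes: int = 16) -> list[str]:
    """Split data into chunks and prefix each with a 2-byte address (hypothetical layout)."""
    strands = []
    for index, start in enumerate(range(0, len(data), payload_bytes)):
        chunk = data[start:start + payload_bytes]
        strands.append(bytes_to_dna(index.to_bytes(2, "big") + chunk))
    return strands


def read_chunk(strands: list[str], index: int) -> bytes:
    """Random access: return the payload of the strand whose address matches index."""
    prefix = bytes_to_dna(index.to_bytes(2, "big"))
    for strand in strands:
        if strand.startswith(prefix):
            return dna_to_bytes(strand[len(prefix):])
    raise KeyError(index)
```

In this toy layout, `read_chunk(make_strands(data), 1)` returns the second 16-byte chunk of `data` without decoding any other strand. A physical system would select strands chemically rather than with a string comparison.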
“The scale and complexity of the world’s ‘big data’ problems are increasing rapidly,” said MIST program manager David Markowitz. “Use cases that require storage and random access from exabytes of mostly unstructured data are now well-established in the private sector and are of increasing relevance to the public sector.” Registration closes on February 14, 2018.
Not surprisingly, IARPA is emphasizing the multidisciplinary nature of the project. Among the disciplines expected to be tapped are chemistry, synthetic biology, molecular biology, biochemistry, bioinformatics, microfluidics, semiconductor engineering, computer science and information theory. IARPA is seeking participation from academic institutions and companies from around the world.
The Proposers’ Day is February 21. Here’s a link to the program announcement: https://www.iarpa.gov/index.php/research-programs/mist