As Niels Bohr said, “Prediction is very difficult, especially if it’s about the future.” Regarding storage technology futures, the smart money has been riding on NVMe to deliver the step change required for the coming decades. The past five years or so have been notable for the mainstream adoption of NAND flash.
Along with increasing NAND capacities and the emergence of ultra-low latency, byte-addressable NVM, the NVM Express (NVMe) standard has arrived at just the right time. NVMe allows the low-latency potential of non-volatile memory devices to be realized across fabrics, removing the prior limitations imposed by SCSI layers. Together, these device and protocol advances place the remaining bottleneck to applications squarely at the I/O software layer and parallel file systems. These file systems were developed when underlying device latency was in the millisecond range, when a thick software layer incurring millisecond-order latencies was perfectly acceptable. Take that same layer, introduce a flash backend, and the once-fast file system becomes an IOPS barrier between application and storage media.
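A rough back-of-the-envelope calculation makes the point. The latencies below are illustrative round numbers (a hypothetical 1 ms software stack over a roughly 5 ms HDD versus a roughly 10 µs NVMe device), not measurements of any particular system:

```python
# Illustrative (assumed) latencies, not vendor measurements.
def software_overhead_fraction(device_latency_us, software_latency_us):
    """Fraction of total I/O latency spent in the software stack."""
    total = device_latency_us + software_latency_us
    return software_latency_us / total

# HDD era: a 1 ms software stack over a 5 ms device is a modest tax.
hdd = software_overhead_fraction(device_latency_us=5000, software_latency_us=1000)
# NVMe era: the same stack over a 10 us device dominates the I/O path.
nvme = software_overhead_fraction(device_latency_us=10, software_latency_us=1000)

print(f"HDD era:  software is {hdd:.0%} of each I/O")   # ~17%
print(f"NVMe era: software is {nvme:.0%} of each I/O")  # ~99%
```

With those assumed figures, the same software layer goes from a sixth of each I/O to effectively all of it, which is the "IOPS barrier" described above.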
DDN’s initiative to bridge this chasm between application and ultra-low latency storage began from scratch in 2012. The project that was to become Infinite Memory Engine® (IME®) was designed fundamentally to address the sub-microsecond latencies expected by 2020+ and, unlike classic all flash arrays, to do so at supercomputer scale. Today, IME is scale-out, flash-native, software-defined, storage cache that integrates with parallel file systems to support the most demanding of I/O workloads. IME interfaces directly with applications and secures I/O into an array of NVM servers via a data path that eliminates file system bottlenecks.
IME’s ground-up implementation also allowed DDN® to address many other shortcomings of parallel file systems. IME is “write-anywhere,” allowing clients to vary their data transmission rates to servers depending on load. This prevents the classic “Amdahl’s law” drawback of parallel file systems, whereby individual slow-performing storage devices and servers can throttle the whole application workload. IME is also flash-optimized, bringing benefits in SSD management and delivering consistent performance and longer lifetimes for NAND flash. Rebuilds are highly declustered, so complete, large-capacity data rebuilds finish in a few minutes.
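The straggler effect can be illustrated with a toy model (this is not IME's actual placement algorithm, and the bandwidth figures are made up): with fixed striping, every client waits on its assigned server, so the slowest server gates the job; with load-aware placement, data flows to whichever server has spare bandwidth.

```python
# Toy model of the "slowest server" problem, with hypothetical bandwidths.
def time_fixed_striping(server_bw_gbs, total_gb):
    # Data is striped evenly across servers; completion is gated
    # by whichever server drains its share slowest.
    per_server = total_gb / len(server_bw_gbs)
    return max(per_server / bw for bw in server_bw_gbs)

def time_load_aware(server_bw_gbs, total_gb):
    # Data flows in proportion to each server's bandwidth, so the
    # aggregate bandwidth is what matters.
    return total_gb / sum(server_bw_gbs)

# Nine healthy servers at 10 GB/s, one degraded server at 1 GB/s.
bw = [10] * 9 + [1]
print(time_fixed_striping(bw, 1000))  # 100.0 s, gated by the slow server
print(time_load_aware(bw, 1000))      # ~11.0 s, slow server barely matters
```

In this sketch a single degraded server slows a fixed-stripe write by roughly 9x, while load-aware placement loses only the degraded server's share of aggregate bandwidth.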
IME is a scale-out cache that sits in front of parallel file systems. As such, it introduces flash-cache economics: system architects can reduce both capital and recurring (power, cooling, and footprint) spend by decoupling performance from capacity, using IME to hit IOPS and throughput targets and a backing parallel file system with large-capacity drives to meet storage volume requirements.
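As a sketch of that economics argument, with purely hypothetical per-terabyte prices and a hypothetical 5% cache sizing (none of these are DDN figures):

```python
# Hypothetical prices for illustration only, not DDN list prices.
FLASH_PER_TB = 400.0   # $/TB, NVMe flash (cache tier)
HDD_PER_TB = 40.0      # $/TB, capacity drives (file system tier)

capacity_tb = 10_000     # total capacity requirement
cache_fraction = 0.05    # flash sized for the hot working set, not capacity

# Meeting capacity entirely in flash vs. a small flash cache over HDD.
all_flash = capacity_tb * FLASH_PER_TB
tiered = capacity_tb * cache_fraction * FLASH_PER_TB + capacity_tb * HDD_PER_TB

print(f"all-flash: ${all_flash:,.0f}")  # $4,000,000
print(f"tiered:    ${tiered:,.0f}")     # $600,000
```

Under these assumptions the tiered design costs a fraction of the all-flash one because flash is purchased only for performance, not for capacity.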
The neatest bit is IME’s ability to solve the broadest spectrum of I/O problems. Managing the saddle distribution of I/O sizes in HPC is a difficult problem for file systems, and new application methods such as multi-scale physics, adaptive mesh refinement, and ever more complex workloads are making the hardest components of I/O workloads harder still. The rapidly developing field of supercomputer-scale analytics and machine learning exacerbates the problem, both by introducing demanding read workloads and by driving much greater concurrency (thread counts), since these applications typically exploit many-core, often heterogeneous, compute environments. I/O is now characterized not by ideal, large, sequential access, but by a complex mixture of large, small, random, unaligned, high-concurrency reads and writes that demand both streaming performance and high IOPS. HPC file systems have excelled at extracting maximum large-I/O throughput from each HDD, but their small-I/O handling has been very limited. IME supports reads and writes at sizes ranging from large down to 4K with the same blistering performance.
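The bandwidth cost of small I/O under a fixed per-operation software overhead can be sketched as follows (the 1 ms per-op cost is an illustrative assumption, not a benchmark of any file system):

```python
# Why per-op overhead hurts small I/O: a fixed software cost per operation
# caps IOPS, so delivered bandwidth collapses as the I/O size shrinks.
def bandwidth_mib_s(io_size_kib, per_op_overhead_ms):
    ops_per_sec = 1000.0 / per_op_overhead_ms  # serial ops on one thread
    return ops_per_sec * io_size_kib / 1024.0

for size_kib in (1024, 64, 4):
    print(f"{size_kib:>5} KiB I/O -> {bandwidth_mib_s(size_kib, 1.0):8.1f} MiB/s")
# 1024 KiB -> 1000 MiB/s; 64 KiB -> 62.5 MiB/s; 4 KiB -> ~3.9 MiB/s
```

With the per-operation cost held fixed, shrinking the I/O from 1 MiB to 4 KiB cuts delivered bandwidth by 256x, which is why sustaining high IOPS at 4K is the hard part of the mixed-workload problem.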
To learn more about IME, please visit the DDN website.