Intel’s persistent memory technology, Optane, and its DAOS (Distributed Asynchronous Object Storage) stack continue to impress and gain market traction. Yesterday, Intel reported that an Optane and DAOS-based system finished atop the latest IO500 list, released in loose conjunction with ISC 2020. The showing was no doubt satisfying for Intel, which finished second to WekaIO in a near-tie on the previous IO500 list at SC19.
The latest list was not close. Intel’s “Wolf” system (52 nodes) used 30 storage servers running on second-generation Xeon Platinum processors and scored 1792.98 on the IO500 benchmark, nearly double the 938.95 scored by the second-place entry from AWS/WekaIO (345 nodes).
The IO500 was created in 2017 by the Virtual Institute for IO which in a self-descriptive paper noted, “Benchmarking of HPC storage systems is a complex task. Parallel I/O is not only influenced by CPU performance for latency and the networking stack but also on the underlying storage technology and software stack. With the IO-500, we have defined a comprehensive benchmark suite that enables comparison of high-performance storage systems. Similar to the TOP500 list for compute architectures, IO500 will allow tracking performance growth over the years and analyze changes in the storage landscape. The IO-500 will not only provide key metrics of expected performance, but serve as a repository for fostering sharing best practices within the community.”
Two lists are compiled. The main one is a comprehensive list for system entries of all sizes. The second list is restricted to systems with exactly 10 clients, which according to the IO500 organizers enables “a more direct comparison of file system efficiency and per-server performance.” Optane combined with DAOS dominated the top of both categories. In the 10-node challenge, the three Intel Optane/DAOS entries (Intel, TACC, and Argonne) took the top three places.
This is all encouraging news for Intel. Argonne, of course, is busily working with Intel on the Aurora supercomputer, which is scheduled to go live in 2021 as the first U.S. exascale system. Aurora will use Optane and DAOS. The latest IO500 showing is a positive indicator for Aurora, where Intel’s process stumbles have stirred worry over its ability to deliver the Sapphire Rapids microprocessor and Ponte Vecchio GPU on schedule.
Intel quoted Gordon McPheeters of Argonne Leadership Computing Facility in today’s announcement, “The recent IO-500 results for DAOS demonstrate the continuing maturity of the software’s functionality enabled by a well-managed code development and testing process. The collaborative development program will continue to deliver additional capabilities for DAOS in support of Argonne’s upcoming exascale system, Aurora.”
Intel describes DAOS as “an open source software-defined scale-out object store that provides high bandwidth, low latency, and high I/O operations per second (IOPS) storage containers to HPC applications.”
Shown below are a chart with an overview and a table with a few more metrics for the top twenty IO500 performers.
Broadly, it is interesting to see the rise of new storage technology. Talking about DAOS earlier this year, Ari Berman of the BioTeam research computing consultancy told HPCwire, “It’s too new to say much about DAOS but the concept of asynchronous IO is very interesting. It’s essentially a queue mechanism at the system write level so system waits in the processors don’t have to happen while a confirmed write back comes from the disks. So asynchronous IO allows jobs can keep running while you’re waiting on storage to happen, to a limit of course. That would really improve the data input-output pipelines in those systems. It’s a very interesting idea. I like asynchronous data writes and asynchronous storage access. I can see there very easily being corruption that creeps into those types of things and data without very careful sequencing. It will be interesting to watch. If it works it will be a big innovation.”
A blog by Intel’s Kelsey Rose Prantis briefly describes the IO500 results and digs into the Optane/DAOS architecture.
Readers who wish to delve further into the IO500 workloads can find details at the Virtual Institute for IO website. The benchmark covers various workloads and computes a single score for comparison.
The workloads are:
- IOEasy: Applications with well optimized I/O patterns
- IOHard: Applications that require a random workload
- MDEasy: Metadata/small objects
- MDHard: Small files (3901 bytes) in a shared directory
- Find: Finding relevant objects based on patterns
“The individual performance numbers are preserved and accessible via the web or the raw data. This allows deriving other relevant metrics,” according to the Virtual Institute for IO.
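To illustrate how a single score can be derived from the individual workload numbers, here is a minimal Python sketch. It assumes the scoring rule as commonly described by the IO500 organizers, which this article does not spell out: the bandwidth results (in GiB/s) and the metadata results (in kIOPS) are each combined with a geometric mean, and the final score is the square root of the product of those two means. The function names and sample figures are hypothetical and are not taken from any submission.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers, computed via logarithms."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def io500_style_score(bandwidth_gib_s, metadata_kiops):
    """Combine per-workload results into one score.

    Assumption: score = sqrt(geomean(bandwidth) * geomean(metadata)),
    following the commonly described IO500 rule. Check the official
    rules before relying on this for real submissions.
    """
    bw_score = geometric_mean(bandwidth_gib_s)
    md_score = geometric_mean(metadata_kiops)
    return math.sqrt(bw_score * md_score)


# Hypothetical per-workload results, chosen only to keep the arithmetic simple.
sample_bandwidth = [4.0, 4.0]   # e.g. easy and hard bandwidth phases, GiB/s
sample_metadata = [9.0, 9.0]    # e.g. metadata and find phases, kIOPS

score = io500_style_score(sample_bandwidth, sample_metadata)
print(f"combined score: {score:.2f}")
```

Because geometric means are used, one very weak phase pulls the combined score down more than it would under an arithmetic average, so a system cannot reach the top on a single strong workload alone.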
Link to IO500: https://www.vi4io.org/io500/start