Work by Australia’s National Computational Infrastructure (NCI) to integrate IBM Power8 technology with NCI’s existing Raijin supercomputer is being heralded by IBM as further evidence of the IBM/OpenPOWER architecture’s growing strength in the scientific community and of its distinct advantage for some AI and data-intensive HPC workloads.
NCI purchased four Power8-based HPC servers last December and has been integrating them with the x86-cum-GPU-based Raijin since. NCI has also been porting key applications to the IBM architecture. It is reportedly the first institution to port Q-Chem, a quantum chemistry package, to Power Systems, and its optimization of NAMD molecular dynamics code is said to have outperformed Intel Broadwell x86 architecture on some benchmarks.
“To be the first ever Australian organization to join the OpenPOWER Foundation provides recognition of NCI’s standing, and represents a step toward a more heterogeneous architecture,” said NCI’s Allan Williams, associate director (Services and Technologies), in today’s IBM release. Heterogeneous computing, of course, is at the heart of the strategy pursued by IBM and the OpenPOWER Foundation.
Dave Turek, IBM VP for Exascale, wrote in an accompanying blog post today, “For now, NCI researchers are utilizing these nodes because the extraordinary memory bandwidth by itself provides significant performance advantage for some of their applications. But, as familiarity with the technology increases, it is anticipated that the IBM Power Systems nodes will present the opportunity for those researchers to explore the intersection of AI and HPC across a wide range of scientific applications. The point, of course, is that the IBM Power Systems design presents to clients a package of HPC and AI capability all rolled into one.”
According to IBM, benchmarks from the research testbed suggest that a wide range of scientific disciplines could eventually benefit from optimized performance on the Power architecture, including (but not limited to) physics, biology and chemistry. IBM has for some time promoted its Power8 and Power8+ servers as well suited for AI. Details of the servers deployed by NCI were not disclosed.
Raijin, named after the Shinto god of thunder, lightning and storms, is a hybrid Fujitsu Primergy and Lenovo NeXtScale high-performance, distributed-memory cluster. It currently comprises:
- 84,656 cores (Intel Xeon Sandy Bridge 2.6 GHz, Broadwell 2.6 GHz) in 4,416 compute nodes
- 120 NVIDIA Tesla K80 GPUs in 30 nodes and 8 NVIDIA Tesla P100 GPUs in 2 nodes
- 32 Intel Xeon Phi processors (64-core Knights Landing, 1.3 GHz) in 32 compute nodes
- 300 Terabytes of main memory
- Hybrid FDR/EDR Mellanox InfiniBand full fat-tree interconnect (up to 100 Gb/sec)
- 8 Petabytes of high-performance operational storage capacity
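To put the Raijin figures above in perspective, the short Python sketch below derives a few per-node averages from them. It is only back-of-the-envelope arithmetic: the variable names are illustrative, the memory figure assumes decimal units (1 TB = 1,000 GB) and an even spread across the listed compute nodes, and none of the derived ratios are official NCI numbers.

```python
# Figures copied from the Raijin specification list above.
CORES = 84_656
COMPUTE_NODES = 4_416
K80_GPUS, K80_NODES = 120, 30
P100_GPUS, P100_NODES = 8, 2
MAIN_MEMORY_TB = 300


def per_node(total, nodes):
    """Return the simple average of a resource per node."""
    return total / nodes


# Averages only: the cluster mixes Sandy Bridge and Broadwell nodes,
# so no individual node necessarily matches these values.
cores_per_node = per_node(CORES, COMPUTE_NODES)
k80_per_node = per_node(K80_GPUS, K80_NODES)
p100_per_node = per_node(P100_GPUS, P100_NODES)

# Assumption: decimal units (1 TB = 1000 GB) and memory spread evenly
# over the listed compute nodes.
mem_gb_per_node = per_node(MAIN_MEMORY_TB * 1000, COMPUTE_NODES)

print(f"Average cores per compute node: {cores_per_node:.2f}")
print(f"K80 GPUs per GPU node:          {k80_per_node:.0f}")
print(f"P100 GPUs per GPU node:         {p100_per_node:.0f}")
print(f"Average memory per node (GB):   {mem_gb_per_node:.1f}")
```

The GPU counts work out to four per node for both the K80 and P100 partitions, while the core and memory averages are blended across the two Xeon generations.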
Turek observed in his blog, “Change is hard for many to accept immediately. And the HPC community, for all its technological innovation over the years, is no different: for many, the only thing that matters is the number of peak flops in a system (regardless of whether or not they can ever be used or even if they are helpful to any extent in optimizing a complex workflow). But change has a way of overcoming the inertia of the past as value of the new approach manifests. The future of HPC will likely no longer be tied to the number of flops, but more closely to the insight that is generated. And the revolution going on at this moment, where AI and HPC can come together to yield potentially dramatic enhanced value, may well be the catalyst to facilitate the change.”
Link to IBM release: https://www-03.ibm.com/press/au/en/pressrelease/53331.wss
Link to Turek blog: https://www.ibm.com/blogs/systems/the-ai-revolution-in-hpc/