It’s been a heady two weeks for the ARM HPC advocacy camp. At this week’s Mont-Blanc Project meeting held at the Barcelona Supercomputing Center, Cray announced plans to build an ARM-based supercomputer in the U.K., while Mont-Blanc selected Cavium’s ThunderX2 ARM chip for its third phase of development. Last week, France’s CEA and Japan’s RIKEN announced a deep collaboration aimed largely at fostering the ARM ecosystem. This activity follows a busy 2016, when SoftBank acquired ARM, OpenHPC announced ARM support, ARM released its SVE spec, Fujitsu chose ARM for the post-K machine, and ARM acquired HPC tool provider Allinea in December.
The pieces of an HPC ecosystem for ARM seem to be sliding, albeit unevenly, into place. Market traction in terms of HPC products still seems far off – there needs to be product available, after all – but the latest announcements suggest growing momentum in sorting out the components needed for potential ARM-based HPC offerings. Plenty of obstacles remain: the schedule for Fujitsu’s much-discussed ARM-based post-K computer has been delayed amid suggestions that processor issues are the main cause. Nevertheless, interest in ARM for HPC is rising.
The biggest splash at this week’s Mont-Blanc project meeting was the announcement of Cray’s plans to build a massive ARM supercomputer for the GW4 consortium in the U.K. At first glance, it looks to be the first production ARM-based supercomputer. Named Isambard after the Victorian engineer Isambard Kingdom Brunel, the new system is scheduled for delivery in the March-December 2017 timeframe. Importantly, Isambard “will provide multiple advanced architectures within the same system in order to enable evaluation and comparison across a diverse range of hardware platforms.”
Project leader and professor of HPC at the University of Bristol, Simon McIntosh-Smith, said “Scientists have a growing choice of potential computer architectures to choose from, including new 64-bit ARM CPUs, graphics processors, and many-core CPUs from Intel. Choosing the best architecture for an application can be a difficult task, so the new Isambard GW4 Tier 2 HPC service aims to provide access to a wide range of the most promising emerging architectures, all using the same software stack. [It’s] a unique system that will enable direct ‘apples-to-apples’ comparisons across architectures, thus enabling UK scientists to better understand which architecture best suits their application.”
Here’s a quick Isambard snapshot:
- Cray CS-400 system
- 10,000+ 64-bit ARMv8 cores
- HPC-optimized software stack
- Will be used for comparisons with x86, Knights Landing, and Pascal processors
- Cost £4.7 million over three years
The specific ARM chip planned for use was not named, although speculation is that it is likely to be a Cavium part. The new machine will be hosted by the Met Office, the U.K.’s weather and climate forecasting agency. In the release announcing the project, Paul Selwood, Manager for HPC Optimization at the Met Office, said: “This system will enable us, in co-operation with our partners, to accelerate insights into how our weather and climate models need to be adapted for these emerging CPU architectures.” The GW4 Alliance brings together four leading research-intensive universities: Bath, Bristol, Cardiff and Exeter.
The second splash at the BSC meeting was perhaps less spectacular but also important. The Mont-Blanc project has been percolating along since 2011. A smaller prototype was stood up in 2015, and it seems clear that much of Europe is hoping ARM-based processors will offer an HPC alternative and give Europe greater control over its exascale efforts. Cavium’s ThunderX2 chip, a 64-bit server processor compliant with the ARMv8-A architecture specification and ARM’s SBSA and SBBR standards, will power the third-phase prototype.
Mont-Blanc, of course, is the European effort to explore how ARM can be practically scaled for larger machines including future exascale systems. Atos/Bull is the primary contractor. The third phase of the Mont-Blanc project seeks to:
- Define the architecture of an Exascale-class compute node based on the ARM architecture, and capable of being manufactured at industrial scale.
- Assess the available options for maximum compute efficiency.
- Develop the matching software ecosystem to pave the way for market acceptance.
The CEA-RIKEN collaboration announced last week is yet another ARM ecosystem momentum builder. “We are committed to building the ARM-based ecosystems and we want to send that message to those who are related to ARM so that those people will be excited in getting in contact with us,” said Shig Okaya, director, Flagship 2020, and a project leader for the CEA-RIKEN effort. It will, among other things, focus on programming languages, execution materials, and work schedulers optimized for energy. Co-development of codes and code sharing are big parts of the deal. (HPCwire covers the CEA-RIKEN partnership in greater detail in a separate article.)
Whether the increased attention on ARM will translate into success beyond the mobile and SoC world, where it is now a dominant player, isn’t clear. One of CEA’s goals is to compare ARM with a range of architectures to determine which performs best and for which workloads. Many market watchers are wary of ARM’s potential in HPC, which is still a relatively small market. Then again, less success in HPC wouldn’t necessarily rule out success in traditional servers. We’ll see.