IBM Begins Power9 Rollout with Backing from DOE, Google

By Tiffany Trader

December 6, 2017

After over a year of buildup, IBM is unveiling its first Power9 system based on the same architecture as the Department of Energy CORAL supercomputers, Summit and Sierra. The new AC922 server pairs two Power9 CPUs with four or six Nvidia Tesla V100 NVLink GPUs. IBM is positioning the Power9 architecture as “a game-changing powerhouse for AI and cognitive workloads.”

The AC922 extends many of the design elements introduced in Power8 “Minsky” boxes with a focus on enabling connectivity to a range of accelerators – Nvidia GPUs, ASICs, FPGAs, and PCIe-connected devices — using an array of interfaces. In addition to being the first servers to incorporate PCIe Gen4, the new systems support the NVLink 2.0 and OpenCAPI protocols, which offer nearly 10x the maximum bandwidth of PCI-E 3.0 based x86 systems, according to IBM.

IBM AC922 rendering

“We designed Power9 with the notion that it will work as a peer computer or a peer processor to other processors,” said Sumit Gupta, vice president of of AI and HPC within IBM’s Cognitive Systems business unit, ahead of the launch. “Whether it’s GPU accelerators or FPGAs or other accelerators that are in the market, our aim was to provide the links and the hooks to give all these accelerators equal footing in the server.”

In the coming months and years there will be additional Power9-based servers to follow from IBM and its ecosystem partners, but this launch is all about the flagship AC922 platform and specifically its benefits to AI and cognitive computing – something Ken King, general manager of OpenPOWER for IBM Systems Group, shared with HPCwire when we sat down with him at SC17 in Denver.

“We didn’t build this system just for doing traditional HPC workloads,” King said. “When you look at what Power9 has with NVLink 2.0 we’re going from 80 gigabits per second throughput [in NVLink 1.0] to over 150 gigabits per second throughput. PCIe Gen3 only has 16. That GPU to CPU I/O is critical for a lot of the deep learning and machine learning workloads.”

Coherency, which Power9 introduces via both CAPI and NVLink 2.0, is another key enabler. As AI models grow large, they can easily outgrow GPU memory capacity, but the AC922 addresses these concerns by allowing accelerated applications to leverage system memory as GPU memory. This reduces latency and simplifies programming by eliminating data movement and locality requirements.

The AC922 server can be configured with either four or six Nvidia Volta V100 GPUs. According to IBM, a four GPU air-cooled version will be available December 22 and both four- and six-GPU water-cooled options are expected to follow in the second quarter of 2018.

While the new Power9 boxes have gone by a couple different codenames (“Witherspoon” and “Newell”), we’ve also heard folks at IBM refer to them informally as their “Summit servers” and indeed there is great visibility in being the manufacturer for what is widely expected to be the United States’ next fastest supercomputer. Thousands of the AC922 nodes are being connected together along with storage and networking to drive approximately 200 petaflops at Oak Ridge and 120 petaflops at Lawrence Livermore.

As King pointed out in reference to the delayed and retooled Argonne “Aurora” system, only one of the original CORAL contractors is fulfilling its mission to deliver “pre-exascale” supercomputing capability to the collaboration of US labs.

IBM has also been tapped by Google, which with partner Rackspace is building a server with Power9 processors called Zaius. In a prepared statement, Bart Sano, vice president of Google Platforms, praised “IBM’s progress in the development of the latest POWER technology” and said “the POWER9 OpenCAPI Bus and large memory capabilities allow for further opportunities for innovation in Google data centers.”

IBM sees the hyperscale market as “a good volume opportunity” but is obviously aware of the impact that volume pricing has had on the traditional server market. “We do see strong pull from them, but we have many other elements in play,” said Gupta. “We have solutions that go after the very fast-growing AI space, we have solutions that go after the open source databases, the NoSQL datacenters. We have announced a partnership with Nutanix to go after the hyperconverged space. So if you look at it, we have lots of different elements that drive the volume and opportunity around our Linux on Power servers, including of course SAP HANA.”

IBM will also be selling Power9 chips through its OpenPower ecosystem, which now encompasses 300 members. IBM says it’s committed to deploying three versions of the Power9 chip, one this year, one in 2018 and another in 2019. The scale-out variant is the one it is delivering with CORAL and with the AC922 server. “Then there will be a scale-up processor, which is the traditional chip targeted towards the AIX and the high-end space and then there’s another one that will be more of an accelerated offering with enhanced memory and other features built into it; we’re working with other memory providers to do that,” said King.

He added that there might be another version developed outside of IBM, leveraging OpenPower, which gives other organizations the opportunity to utilize IBM’s intellectual property to build their own differentiated chips and servers.

King is confident that the demand for IBM’s latest platform is there. “I think we are going to see strong out-of-the-chute opportunities for Power9 in 2018. We’re hoping to see some growth this quarter with the solution that we’re bringing out with CORAL but that will be more around the ESP customers. Next year is when we’re expecting that pent up demand to start showing positive return overall for our business results.”

A lot is riding on the success of Power9 after Power8 failed to generate the kind of profits that IBM had hoped for. There was growth in Power8’s first year, said King, but after that sales tailed off. He added that capabilities like Nutanix and building PowerAI and other software based solutions on top of it have led to a bit of a rebound. “It’s still negative but it’s low negative,” he said, “but it’s sequentially grown quarter to quarter in the last three quarters, since Bob Picciano [SVP of IBM Cognitive Systems] came on.”

Several IBM reps we spoke with acknowledged that pricing – or at least pricing perception – was a problem for Power8.

“For our traditional market I think pricing was competitive; for some of the new markets that we’re trying to get into, like the hyperscaler datacenters, I think we’ve got some work to do,” said King. “It’s really a TCO and a price-performance competitiveness versus price only. And we think we’re going to have a much better price performance competitiveness with Power9 in the hyperscalers and some of the low-end Linux spaces that are really the new markets.”

“We know what we need to do for Power9 and we’re very confident with a lot of the workload capabilities that we’ve built on top of this architecture, that we’re going to see a lot more growth, positive growth, on Power9, with PowerAI with Nuta,nix with some of the other workloads we’ve put in there. And it’s not going to be a hardware-only reason,” King continued. “It’s going to be a lot of the software capabilities that we’ve built on top of the platform, and supporting more of the newer workloads that are out there. If you look at the IDC studies of the growth curve of cognitive infrastructure, it goes from about $1.6 billion to $4.5 billion over the next two or three years – it’s a huge hockey stick – and we have built and designed Power9 for that market, specifically and primarily for that market.”

Subscribe to HPCwire's Weekly Update!

Be the most informed person in the room! Stay ahead of the tech trends with industy updates delivered to you every week!

Neural Network ‘Synapse’ Technology Showcased at IEEE Meeting

December 12, 2018

There’s nice snapshot of advancing work to develop improved neural network “synapse” technologies posted yesterday on IEEE Spectrum. Lower power, ease of use, manufacturability, and performance are all key paramete Read more…

By John Russell

Is Amazon’s Plunge into Server Chips a Watershed Moment?

December 11, 2018

For several years now the big cloud providers – Amazon, Microsoft Azure, Google, et al – have been transforming from technology consumers into technology creators in hardware and software. The most recent example bei Read more…

By John Russell

Mellanox Uses Univa to Extend Silicon Design HPC Operation to Azure

December 11, 2018

Call it a corollary to Murphy’s Law: When a system is most in demand, when end users are most dependent on the system performing as required, when it’s crunch time – that’s when the system is most likely to blow up. Or make you wait in line to use it. Read more…

By Doug Black

HPE Extreme Performance Solutions

AI Can Be Scary. But Choosing the Wrong Partners Can Be Mortifying!

As you continue to dive deeper into AI, you will discover it is more than just deep learning. AI is an extremely complex set of machine learning, deep learning, reinforcement, and analytics algorithms with varying compute, storage, memory, and communications needs. Read more…

IBM Accelerated Insights

Blurring the Lines Between HPC and AI @ SC18

The dominant topic at SC18 was the convergence of HPC and Artificial Intelligence (AI) with some of the biggest research and enterprise HPC users providing perspectives on how HPC and AI are moving closer together. Read more…

Clemson’s Cautionary Cryptomining Tale

December 11, 2018

In some ways, the bigger the computer, the more vulnerable it is to cryptomining as Clemson University discovered after cryptominers dug into its Palmetto supercomputer. When a number of nodes on Clemson University’s P Read more…

By Staff

Topology Can Help Us Find Patterns in Weather

December 6, 2018

Topology--–the study of shapes-- seems to be all the rage. You could even say that data has shape, and shape matters. Shapes are comfortable and familiar conc Read more…

By James Reinders

Zettascale by 2035? China Thinks So

December 6, 2018

Exascale machines (of at least a 1 exaflops peak) are anticipated to arrive by around 2020, a few years behind original predictions; and given extreme-scale performance challenges are not getting any easier, it makes sense that researchers are already looking ahead to the next big 1,000x performance goal post: zettascale computing. Read more…

By Tiffany Trader

Robust Quantum Computers Still a Decade Away, Says Nat’l Academies Report

December 5, 2018

The National Academies of Science, Engineering, and Medicine yesterday released a report – Quantum Computing: Progress and Prospects – whose optimism about Read more…

By John Russell

Revisiting the 2008 Exascale Computing Study at SC18

November 29, 2018

A report published a decade ago conveyed the results of a study aimed at determining if it were possible to achieve 1000X the computational power of the the Read more…

By Scott Gibson

AWS Debuts Lustre as a Service, Accelerates Data Transfer

November 28, 2018

From the Amazon re:Invent main stage in Las Vegas today, Amazon Web Services CEO Andy Jassy introduced Amazon FSx for Lustre, citing a growing body of applicati Read more…

By Tiffany Trader

AWS Launches First Arm Cloud Instances

November 28, 2018

AWS, a macrocosm of the emerging high-performance technology landscape, wants to be everywhere you want to be and offer everything you want to use (or at least Read more…

By Doug Black

Move Over Lustre & Spectrum Scale – Here Comes BeeGFS?

November 26, 2018

Is BeeGFS – the parallel file system with European roots – on a path to compete with Lustre and Spectrum Scale worldwide in HPC environments? Frank Herold Read more…

By John Russell

DOE Under Secretary for Science Paul Dabbar Interviewed at SC18

November 21, 2018

During the 30th annual SC conference in Dallas last week, SC18 hosted U.S. Department of Energy Under Secretary for Science Paul M. Dabbar. In attendance Nov. 13-14, Dabbar delivered remarks at the Top500 panel, met with a number of industry stakeholders and toured the show floor. He also met with HPCwire for an interview, where we discussed the role of the DOE in advancing leadership computing. Read more…

By Tiffany Trader

Quantum Computing Will Never Work

November 27, 2018

Amid the gush of money and enthusiastic predictions being thrown at quantum computing comes a proposed cold shower in the form of an essay by physicist Mikhail Read more…

By John Russell

Cray Unveils Shasta, Lands NERSC-9 Contract

October 30, 2018

Cray revealed today the details of its next-gen supercomputing architecture, Shasta, selected to be the next flagship system at NERSC. We've known of the code-name "Shasta" since the Argonne slice of the CORAL project was announced in 2015 and although the details of that plan have changed considerably, Cray didn't slow down its timeline for Shasta. Read more…

By Tiffany Trader

IBM at Hot Chips: What’s Next for Power

August 23, 2018

With processor, memory and networking technologies all racing to fill in for an ailing Moore’s law, the era of the heterogeneous datacenter is well underway, Read more…

By Tiffany Trader

House Passes $1.275B National Quantum Initiative

September 17, 2018

Last Thursday the U.S. House of Representatives passed the National Quantum Initiative Act (NQIA) intended to accelerate quantum computing research and developm Read more…

By John Russell

Summit Supercomputer is Already Making its Mark on Science

September 20, 2018

Summit, now the fastest supercomputer in the world, is quickly making its mark in science – five of the six finalists just announced for the prestigious 2018 Read more…

By John Russell

CERN Project Sees Orders-of-Magnitude Speedup with AI Approach

August 14, 2018

An award-winning effort at CERN has demonstrated potential to significantly change how the physics based modeling and simulation communities view machine learni Read more…

By Rob Farber

AMD Sets Up for Epyc Epoch

November 16, 2018

It’s been a good two weeks, AMD’s Gary Silcott and Andy Parma told me on the last day of SC18 in Dallas at the restaurant where we met to discuss their show news and recent successes. Heck, it’s been a good year. Read more…

By Tiffany Trader

US Leads Supercomputing with #1, #2 Systems & Petascale Arm

November 12, 2018

The 31st Supercomputing Conference (SC) - commemorating 30 years since the first Supercomputing in 1988 - kicked off in Dallas yesterday, taking over the Kay Ba Read more…

By Tiffany Trader

Leading Solution Providers

SC 18 Virtual Booth Video Tour

Advania @ SC18 AMD @ SC18
ASRock Rack @ SC18
DDN Storage @ SC18
HPE @ SC18
IBM @ SC18
Lenovo @ SC18 Mellanox Technologies @ SC18
NVIDIA @ SC18
One Stop Systems @ SC18
Oracle @ SC18 Panasas @ SC18
Supermicro @ SC18 SUSE @ SC18 TYAN @ SC18
Verne Global @ SC18

TACC’s ‘Frontera’ Supercomputer Expands Horizon for Extreme-Scale Science

August 29, 2018

The National Science Foundation and the Texas Advanced Computing Center announced today that a new system, called Frontera, will overtake Stampede 2 as the fast Read more…

By Tiffany Trader

HPE No. 1, IBM Surges, in ‘Bucking Bronco’ High Performance Server Market

September 27, 2018

Riding healthy U.S. and global economies, strong demand for AI-capable hardware and other tailwind trends, the high performance computing server market jumped 28 percent in the second quarter 2018 to $3.7 billion, up from $2.9 billion for the same period last year, according to industry analyst firm Hyperion Research. Read more…

By Doug Black

Nvidia’s Jensen Huang Delivers Vision for the New HPC

November 14, 2018

For nearly two hours on Monday at SC18, Jensen Huang, CEO of Nvidia, presented his expansive view of the future of HPC (and computing in general) as only he can do. Animated. Backstopped by a stream of data charts, product photos, and even a beautiful image of supernovae... Read more…

By John Russell

Germany Celebrates Launch of Two Fastest Supercomputers

September 26, 2018

The new high-performance computer SuperMUC-NG at the Leibniz Supercomputing Center (LRZ) in Garching is the fastest computer in Germany and one of the fastest i Read more…

By Tiffany Trader

Houston to Field Massive, ‘Geophysically Configured’ Cloud Supercomputer

October 11, 2018

Based on some news stories out today, one might get the impression that the next system to crack number one on the Top500 would be an industrial oil and gas mon Read more…

By Tiffany Trader

Intel Confirms 48-Core Cascade Lake-AP for 2019

November 4, 2018

As part of the run-up to SC18, taking place in Dallas next week (Nov. 11-16), Intel is doling out info on its next-gen Cascade Lake family of Xeon processors, specifically the “Advanced Processor” version (Cascade Lake-AP), architected for high-performance computing, artificial intelligence and infrastructure-as-a-service workloads. Read more…

By Tiffany Trader

Google Releases Machine Learning “What-If” Analysis Tool

September 12, 2018

Training machine learning models has long been time-consuming process. Yesterday, Google released a “What-If Tool” for probing how data point changes affect a model’s prediction. The new tool is being launched as a new feature of the open source TensorBoard web application... Read more…

By John Russell

The Convergence of Big Data and Extreme-Scale HPC

August 31, 2018

As we are heading towards extreme-scale HPC coupled with data intensive analytics like machine learning, the necessary integration of big data and HPC is a curr Read more…

By Rob Farber

  • arrow
  • Click Here for More Headlines
  • arrow
Do NOT follow this link or you will be banned from the site!
Share This