HPC Chips – A Veritable Smorgasbord?

By Dairsie Latimer

October 10, 2017

No, this isn’t about the song from Charlotte’s Web or the Scandinavian predilection for open sandwiches; it’s about the apparent newfound choice in the HPC CPU market.

For the first time since AMD’s ill-fated launch of Bulldozer, the answer to the question, ‘Which CPU will be in my next HPC system?’ doesn’t have to be ‘Whichever variety of Intel Xeon E5 they are selling when we procure’.

In fact, it’s not just in the x86 market where there is now a genuine choice. Soon we will have at least two credible ARM v8 ISA CPUs (from Cavium and Qualcomm respectively) and IBM have gone all in on the Power architecture (having at one point in the last ten years had four competing HPC CPU lines – x86, Blue Gene, Power and Cell).

In fact, it may even be Intel that is left wondering which horse to back in the HPC CPU race with both Xeon lines looking insufficiently differentiated going forward. A symptom of this dilemma is the recent restructuring of the Xeon line along with associated pricing and feature segmentation.

I’m also quite deliberately avoiding the potentially disruptive appearance of a number of radically different computational solutions being honed for machine learning and which will inevitably have some bearing on HPC in the future.

Have we seen peak Intel?

Intel’s 90+ percent market share in the datacentre has for years worried many observers. While their products have undoubtedly been very good, when you have an effective monopoly, the evolutionary pressure that drives innovation and price competitiveness understandably wanes.

“Success breeds complacency. Complacency breeds failure. Only the paranoid survive.” – Andy Grove

The re-emergence of credible competition can only be a good thing for the wider market, but in HPC things are less clear cut. Intel still holds a strong hand in the game of poker that is HPC procurement, namely AVX-512, but since some of the larger Top500 systems tend to be heterogeneous in nature, is this going to be enough to fend off the challenge from the following pack in other parts of the HPC ecosystem?

IBM and Nvidia are clearly hoping to make significant inroads at the top table of HPC with their CORAL generation systems, and Qualcomm and Cavium will also be hoping to chip away at Intel’s monopoly (though they are probably not directly aiming at HPC), but these non-x86 alternatives face significant problems when it comes to showing their capabilities in the HPC space.

AMD have a great opportunity to make gains in the HPC space with their EPYC line (the only x86 competitor) and early signs are encouraging that they will take the fight to Intel and not just on price-performance grounds.

Inertia in HPC is a funny thing

We mainly think of inertia as a property of physical objects but in the HPC industry there is a similar phenomenon relating to application code bases (and languages), instruction sets (and optimised software library ecosystems) and how hard it is to justify doing something different. In the case of HPC, this is really an argument about the barrier to entry for the new HPC CPU vendors, and what they have to be able to demonstrate in order to displace the incumbent (i.e. Intel).

Without trying to evade answering the question, we all hope that the non-Intel vendors can find the right combination of price-performance to chip away at the current Intel dominance in the datacentre. Not because we want to see Intel fail, but because we want them to succeed. Healthy competition is definitely good for users, though less obviously so for Intel’s shareholders.

If all you have is a hammer

“Ah-ha!” I hear you cry, “We already embrace different ISAs and heterogeneity in the Top500,” and indeed we do. In fact, the latest Green500 list is testament to how effective this approach can be. We also know that LINPACK is historically a poor predictor of most actual HPC application performance, but we still use it as a flagship benchmark, predominantly because it does a good job of stress testing the computational elements of a system architecture. With the march towards exascale now looking more like the retreat from Moscow, there is an increasing need to improve system efficiency for applications that don’t exhibit LINPACK-esque scaling characteristics. Machine learning looks to be the new yardstick, so it will be interesting to see the rapid evolution of new solutions and benchmarks.

Moore’s Law in ICU

We should also acknowledge the increasing challenges facing silicon fabrication and process technology. Keeping the Moore’s Law show on the road is hard. This isn’t news to folk in HPC but it is one of the reasons why exascale in under 20MW (anything else looks prohibitively expensive) looks to be an exceedingly challenging goal in the next five years.
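To put a number on that target, the arithmetic can be sketched in a few lines of Python. The function name is purely illustrative, and the only inputs are the two figures quoted above: one exaflops of performance and a 20MW power budget.

```python
def required_gflops_per_watt(target_flops: float, power_budget_watts: float) -> float:
    """Energy efficiency (in GFLOPS per watt) needed to reach a
    performance target within a given power budget."""
    return target_flops / power_budget_watts / 1e9


ONE_EXAFLOPS = 1e18        # 10^18 floating point operations per second
POWER_BUDGET_WATTS = 20e6  # the 20MW budget mentioned in the text

efficiency = required_gflops_per_watt(ONE_EXAFLOPS, POWER_BUDGET_WATTS)
print(f"Required efficiency: {efficiency:.0f} GFLOPS/W")
```

In other words, an exascale system inside that budget has to sustain 50 GFLOPS for every watt it draws, across the whole machine and not just the processors.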

Intel are still at the vanguard when it comes to eking out the increasingly esoteric improvements needed, but when you have to re-state what aspects of process naming conventions should matter, you are already rapidly approaching the point of diminishing returns.

Moore’s Law is an engine that has historically driven significant growth across the board and enabled the in silico renaissance that most HPC users are engaged in, but it is faltering at just the moment that exascale computing systems need a significant uplift in system efficiency. There still need to be huge improvements in parallelism, memory and storage efficiency, and data transmission, and that’s even before you start to consider fault recovery and software complexity for such huge systems.

We’ve been fairly good at scrambling over the various ‘walls’ we’ve encountered in the last couple of decades but does anyone else have a feeling that we are at the cusp of a period of innovation in HPC that we haven’t seen for some time?

Benchmark, benchmark, benchmark

For the first time in at least five years, comparative benchmarking, conducted as part of your pre-tender and tender process, looks to be an absolutely essential step in delivering the best value. Rather than just being viewed as something that provides a little more confidence that the vendors have tuned the MPI implementation and fabric topology, and that you know what compiler flags to flip, it will shine a light into some of the dark, musty corners that more complacent software developers and vendors have chosen to ignore. If for no other reason, it will ensure that the supported pricing you get from your suppliers will be as keen as it should be.
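As a rough illustration of what comparing tender responses on price-performance might look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the system names, runtimes and prices are placeholders rather than real results, and the metric (runtime multiplied by price, lower is better) is just one simple way of weighing speed against cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    system: str       # hypothetical label for a vendor's proposal
    runtime_s: float  # wall-clock time for your own application benchmark
    price: float      # quoted price, in any consistent currency unit


def cost_time_product(result: BenchmarkResult) -> float:
    """Runtime multiplied by price; a lower value means better value."""
    return result.runtime_s * result.price


def rank_by_value(results: list[BenchmarkResult]) -> list[BenchmarkResult]:
    """Return the proposals ordered from best to worst value."""
    return sorted(results, key=cost_time_product)


# Placeholder figures, invented purely to exercise the code.
proposals = [
    BenchmarkResult("System A", runtime_s=100.0, price=10_000.0),
    BenchmarkResult("System B", runtime_s=80.0, price=15_000.0),
    BenchmarkResult("System C", runtime_s=120.0, price=7_000.0),
]

for r in rank_by_value(proposals):
    print(f"{r.system}: {cost_time_product(r):,.0f}")
```

With these made-up figures the fastest system is not the best value, which is exactly the sort of result that only benchmarking your own codes will reveal.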

Dairsie Latimer is a Managing Consultant for Red Oak Consulting.
