The @hpcnotes Predictions for HPC in 2018

By Andrew Jones

January 4, 2018

I’m not averse to making predictions about the world of High Performance Computing (and Supercomputing, Cloud, etc.) in person at conferences, meetings, casual conversations, etc.; however, it has been a while since I stuck my neck out and widely published my predictions for the year ahead in HPC. Of course, such predictions tend to be evenly split between inspired foresight and misguided idiocy. At least some of the predictions will have readers spluttering coffee in indignation at how wrong I am. But, where would the fun in HPC be if we all played safe? So, here goes for the @hpcnotes predictions for HPC in 2018 …

Intel

After spending much of 2017 being called out for ambitiously high pricing of Skylake for HPC customers, and following that with months of Xeon Phi confusion – eventually publicly admitting at SC17 that Knights Hill has been cancelled, while still being unclear about the future of Phi overall – Intel seems to have continued into 2018 in the worst way, with news of kernel memory hardware bugs flooding the IT news and social media space. [NB: these bugs have now been confirmed to affect CPUs from AMD, ARM and other vendors too.] 2018 will also see widespread availability of AMD EPYC, Cavium ThunderX2, and IBM Power9 processors, and so it seems Intel has a tough year ahead. The hardware bug is especially painful here as it negates the “Intel is the safe option” thinking. To be clear, HPC community consensus so far (including NAG’s impartial benchmarking work with customer codes) says Skylake is a very capable and performance-leading processor. However, Skylake has three possible letdowns: (1) a price substantially higher, relative to the benefits gained, than customers are comfortable with; (2) reduced cache per core compared with other CPUs; (3) dependence on a code’s saturation of the vector units to extract the maximum performance. In some early benchmarks, EPYC and TX2 are winning on both price and performance. My prediction is that Intel will meaningfully drop the Skylake price early in 2018 to pull back into a competitive position on price/performance.

AI and ML

Sorry, the media and marketing hype for AI/ML taking over HPC shows no sign of going away. Yes, there are many real use cases for AI and ML (e.g., follow Paige Bailey and colleagues for real examples); however, the aggressive insertion of AI and ML labels into every HPC-related conference agenda (taking over from the mandatory mentions of Big Data) doesn’t add a lot of value, I think. I’m not suggesting that the HPC community (users or providers) ignore AI/ML – indeed, I would firmly advocate that you add these to your portfolio. But, HPC is an exceptionally powerful and widely applicable tool in its own right – it doesn’t need AI/ML to justify itself. My prediction is that AI/ML will continue to hog a share of the HPC marketing noise unrelated to the scale of actual use in the HPC arena.

New processors

As noted above, 2018 sees credible HPC processors from AMD (EPYC), Cavium (ThunderX2) and other ARM chips, and IBM (Power9) surge into general availability. In my view, these are not (yet) competing with Intel Xeon; they are competing with each other to be the best of the rest. Depending on how Intel behaves (NB: this is not just about technology) and how well AMD/ARM/IBM and their system partners actually execute on promises, one of these might close out 2018 being a serious competitor to Intel’s dominance of the HPC processor space. Either way, I predict we will see at least one meaningful (i.e., competitively won, large-scale, for production use) HPC deployment of each of these processors in 2018. I’m also going to add a second prediction to this section: a MIPS-based processor option will start to gain headlines as a real HPC processor candidate in 2018 (not just in China).

Cloud

In most cases, HPC is still cheaper and more capable through traditional in-house systems than via cloud deployments. No amount of marketing changes that. Time might change it, but not by the end of 2018. However, cloud as an option for HPC is not going away. It does present a real option for many HPC workloads, and not just trivial workloads. I am hopeful we are at the end of the era where the cloud providers hoped to succeed by trying to convince everyone that “HPC in-house” advocates were just dinosaurs. The cloud companies all show signs of adjusting their offerings to the actual needs of HPC users (technical, commercial and political needs). This means that an impartial understanding of the pros and cons of cloud for your specific HPC situation is going to be even more critical in 2018. I am certainly being asked to help address the question of HPC in the cloud by my consulting customers with increasing frequency. Azure has been ramping up efforts in HPC (and AI) aggressively over the last few months through acquisitions (e.g., Cycle Computing) and recruitments (e.g., Developer Advocate teams), and I’d expect AWS and Google to do likewise. My prediction is that all three of the major cloud providers (AWS, Azure, Google) will deliver substantially more HPC-relevant solutions in 2018, and at least one will secure a major (and possibly surprising) real HPC customer win.

GPUs

Nvidia also got an unwelcome start to 2018 as they tried to ban (via retrospective changes to license conditions) the use of their cheaper GPUs in datacenter (e.g., HPC, AI, …) applications. Of course, it is no surprise that Nvidia would prefer customers to buy the much more expensive high-end GPUs for datacenter applications. However, it doesn’t say much for the supposedly compelling business case or sales success of the high-end GPUs if they have to force people off the cheaper products first. We (NAG) have done enough benchmarking across enough different customer codes to know that GPUs are flat-out the fastest widely available processor option for codes that can take effective advantage of highly parallel architectures. However, when the price of the high-end GPUs is taken into account, plus the performance left on the floor for the non-accelerated codes, then CPUs often look a better overall choice. Ultimately, adapting many codes to use GPUs (not just a selected few codes to show easy wins) is a big effort. So is adapting workflows to the cloud. With limited resources available, I think users will decide that investing effort in cloud porting is a better long-term return than GPUs. Yes – oddly, I think cloud, not CPUs, will be the pressure that limits the success of GPUs! My prediction is that Nvidia’s unfortunate licensing assertions, coupled with marginal gains in performance relative to total cost of ownership (TCO), plus scarcity of software engineering resources, will mean that fewer newly deployed on-site HPC systems are based around GPUs. On the other hand, I think use of GPUs in the cloud, for HPC, will grow substantially in 2018.

Zettascale

Yes, really. After all, exascale is within grasping distance now. We will see multiple systems at >0.1 EF in 2018. Exascale is being talked about in terms of when and which site first, rather than how and which country first. As exascale now seems likely to happen without all those disruptive changes that voices across the community foretold would be critical, computer science researchers and supercomputer center managers will need to start using the zettascale label to drive the next round of funding bids for novel technologies. There have already been a few small gatherings on zettascale, at least as far back as 2004 (!), but I predict 2018 will see the first mainstream meeting with a session focused on zettascale – perhaps at SC18?

Cybersecurity

The consumer world was wracked in 2017 by a range of large-scale cybersecurity breaches. The government community has been hit badly in previous years too. Sadly, I see cybersecurity moving up the agenda in the HPC world. Not sad that it is happening, but sad that I think it will be forced to happen by one or more incidents. In general, HPC systems are fairly well protected, largely because they are expensive, capable assets and, in some cases, have regulatory criteria to meet. However, performance and ease-of-use for a predominantly research-led userbase have been the traditional strong drivers of requirements, often meaning the risk management decisions have been tilted towards a minimally compliant security configuration. (Security is arguably one area where HPC-in-the-cloud wins.) My prediction for 2018 is twofold: (1) there will be a major security incident on a high-profile HPC system; (2) cybersecurity for HPC will move from a niche topic to a mainstream agenda item for some of the larger HPC conferences.

Finally, Growth

I saw HPC and related things such as AI, cloud, etc., gain lots of momentum in 2017. This included several technologies heralded in confidence finally coming to fruition, new HPC deployments across public and private sector customers, a notable uptick in our HPC consulting work, interesting personnel moves, and an overall excitement and enthusiasm in the HPC community that had been dulled recently. My final prediction is that 2018 will see this growth and energy in the HPC community gather pace. I look forward to new HPC sites emerging, to significant new HPC systems being announced, and to the growing attention on the broader aspects of HPC beyond FLOPS – people, business aspects, impact stories, and more.

I hope you enjoyed my HPC predictions for 2018. Please do engage with me via Twitter (@hpcnotes) or LinkedIn (www.linkedin.com/in/andrewjones) if you want to comment on my inspired foresight or misguided idiocy. I’ll be back with a follow-up article in a week or two on how you can exploit these predictions to your advantage.
