The @hpcnotes Predictions for HPC in 2018

By Andrew Jones

January 4, 2018

I’m not averse to making predictions about the world of High Performance Computing (and Supercomputing, Cloud, etc.) in person at conferences, meetings, casual conversations, etc.; however, it has been a while since I stuck my neck out and widely published my predictions for the year ahead in HPC. Of course, such predictions tend to be evenly split between inspired foresight and misguided idiocy. At least some of the predictions will have readers spluttering coffee in indignation at how wrong I am. But, where would the fun in HPC be if we all played safe? So, here goes for the @hpcnotes predictions for HPC in 2018 …

Intel

After spending much of 2017 being called out for ambitiously high pricing of Skylake for HPC customers, and following that with months of Xeon Phi confusion (eventually publicly admitting at SC17 that Knights Hill has been cancelled, but still not being clear about the future of Phi overall), Intel seems to have continued into 2018 in the worst way, with news of kernel memory hardware bugs flooding the IT news and social media space. [NB: these bugs have now been confirmed to affect CPUs from AMD, ARM and other vendors too.] 2018 will also see widespread availability of AMD EPYC, Cavium ThunderX2, and IBM Power9 processors, and so it seems Intel has a tough year ahead. The hardware bug is especially painful here as it negates the “Intel is the safe option” thinking. To be clear, HPC community consensus so far (including NAG’s impartial benchmarking work with customer codes) says Skylake is a very capable and performance-leading processor. However, Skylake has three possible letdowns: (1) a price substantially higher, relative to the benefits gained, than customers are comfortable with; (2) reduced cache per core compared with other CPUs; (3) dependence on a code’s saturation of the vector units to extract the maximum performance. In some early benchmarks, EPYC and TX2 are winning on both price and performance. My prediction is that Intel will meaningfully drop the Skylake price early in 2018 to pull back into a competitive position on price/performance.

AI and ML

Sorry, the media and marketing hype for AI/ML taking over HPC shows no sign of going away. Yes, there are many real use cases for AI and ML (e.g., follow Paige Bailey and colleagues for real examples); however, the aggressive insertion of AI and ML labels into every HPC-related conference agenda (taking over from the mandatory mentions of Big Data) doesn’t add a lot of value, I think. I’m not suggesting that the HPC community (users or providers) ignore AI/ML – indeed, I would firmly advocate that you add these to your portfolio. But, HPC is an exceptionally powerful and widely applicable tool in its own right – it doesn’t need AI/ML to justify itself. My prediction is that AI/ML will continue to hog a share of the HPC marketing noise unrelated to the scale of actual use in the HPC arena.

New processors

As noted above, 2018 sees credible HPC processors from AMD (EPYC), Cavium (ThunderX2) and other ARM chips, and IBM (Power9) surge into general availability. In my view, these are not (yet) competing with Intel Xeon; they are competing with each other to be the best of the rest. Depending on how Intel behaves (NB: this is not just about technology) and how well AMD/ARM/IBM and their system partners actually execute on promises, one of these might close out 2018 being a serious competitor to Intel’s dominance of the HPC processor space. Either way, I predict we will see at least one meaningful (i.e., competitively won, large-scale, for production use) HPC deployment of each of these processors in 2018. I’m also going to add a second prediction to this section: a MIPS-based processor option will start to gain headlines as a real HPC processor candidate in 2018 (not just in China).

Cloud

In most cases, HPC is still cheaper and more capable through traditional in-house systems than via cloud deployments. No amount of marketing changes that. Time might change it, but not by the end of 2018. However, cloud as an option for HPC is not going away. It does present a real option for many HPC workloads, and not just trivial workloads. I am hopeful we are at the end of the era where the cloud providers hoped to succeed by trying to convince everyone that “HPC in-house” advocates were just dinosaurs. The cloud companies all show signs of adjusting their offerings to the actual needs of HPC users (technical, commercial and political needs). This means that an impartial understanding of the pros and cons of cloud for your specific HPC situation is going to be even more critical in 2018. I am certainly being asked to help address the question of HPC in the cloud by my consulting customers with increasing frequency. Azure has been ramping up efforts in HPC (and AI) aggressively over the last few months through acquisitions (e.g., Cycle Computing) and recruitments (e.g., Developer Advocate teams), and I’d expect AWS and Google to do likewise. My prediction is that all three of the major cloud providers (AWS, Azure, Google) will deliver substantially more HPC-relevant solutions in 2018, and at least one will secure a major (and possibly surprising) real HPC customer win.

GPUs

Nvidia also got an unwelcome start to 2018 as they tried to ban (via retrospective changes to license conditions) the use of their cheaper GPUs in datacenter (e.g., HPC, AI, …) applications. Of course, it is no surprise that Nvidia would prefer customers to buy the much more expensive high-end GPUs for datacenter applications. However, it doesn’t say much for the supposedly compelling business case or sales success of the high-end GPUs if they have to force people off the cheaper products first. We (NAG) have done enough benchmarking across enough different customer codes to know that GPUs are flat-out the fastest widely available processor option for codes that can take effective advantage of highly parallel architectures. However, when the price of the high-end GPUs is taken into account, plus the performance left on the floor for the non-accelerated codes, then the CPUs often look a better overall choice. Ultimately, adapting many codes to use GPUs (not just a selected few codes to show easy wins) is a big effort. So is adapting workflows to the cloud. With limited resources available, I think users will decide that investing effort in cloud porting is a better long-term return than GPUs. Yes, oddly, I think cloud, not CPUs, will be the pressure that limits the success of GPUs! My prediction is that Nvidia’s unfortunate licensing assertions, coupled with marginal gains in performance relative to total cost of ownership (TCO), plus scarcity of software engineering resources, will mean that fewer newly deployed on-site HPC systems will be based around GPUs. On the other hand, I think use of GPUs in the cloud, for HPC, will grow substantially in 2018.

Zettascale

Yes, really. After all, exascale is within grasping distance now. We will see multiple systems at >0.1 EF in 2018. Exascale is being talked about in terms of when and which site first, rather than how and which country first. As exascale now seems likely to happen without all those disruptive changes that voices across the community foretold would be critical, computer science researchers and supercomputer center managers will need to start using the zettascale label to drive the next round of funding bids for novel technologies. There have already been a few small gatherings on zettascale, at least as far back as 2004 (!), but I predict 2018 will see the first mainstream meeting with a session focused on zettascale – perhaps at SC18?

Cybersecurity

The consumer world was wracked in 2017 by a range of large scale cybersecurity breaches. The government community has been hit badly in previous years too. Sadly, I see cybersecurity moving up the agenda in the HPC world. Not sad that it is happening, but sad that I think it will be forced to happen by one or more incidents. In general, HPC systems are fairly well protected, largely because they are expensive, capable assets and, in some cases, have regulatory criteria to meet. However, performance and ease-of-use for a predominantly research-led userbase have been the traditional strong drivers of requirements, often meaning the risk management decisions have been tilted towards a minimally compliant security configuration. (Security is arguably one area where HPC-in-the-cloud wins.) My prediction for 2018 is twofold: (1) there will be a major security incident on a high profile HPC system; (2) cybersecurity for HPC will move from a niche topic to a mainstream agenda item for some of the larger HPC conferences.

Finally, Growth

I saw HPC and related things such as AI, cloud, etc., gain lots of momentum in 2017. This included several technologies heralded in confidence finally coming to fruition, new HPC deployments across public and private sector customers, a notable uptick in our HPC consulting work, interesting personnel moves, and an overall excitement and enthusiasm in the HPC community that had been dulled recently. My final prediction is that 2018 will see this growth and energy in the HPC community gather pace. I look forward to new HPC sites emerging, to significant new HPC systems being announced, and to the growing attention on the broader aspects of HPC beyond FLOPS: people, business aspects, impact stories, and more.

I hope you enjoyed my HPC predictions for 2018. Please do engage with me via Twitter (@hpcnotes) or LinkedIn (www.linkedin.com/in/andrewjones) if you want to comment on my inspired foresight or misguided idiocy. I’ll be back with a follow-up article in a week or two on how you can exploit these predictions to your advantage.
