Compilers and More: The Dangers of COTS Supercomputing

By Michael Wolfe

April 4, 2008

One of the last events at Supercomputing 2007 (SC07) was a Friday-morning panel titled “(Super)Computing on FPGAs, GPUs, Cell and Other Exotic Architectures.”

Jack Dongarra (Univ. Tennessee and ORNL) said that the HPC ecosystem is out of balance; we’ve invested heavily in hardware development, and now we need to invest more heavily in software tools and methods to use the hardware. Rob Pennington (NCSA), the panel moderator, said that the tools will appear when there are enough of these systems out there that the vendors can make money at it. I disagree with both these statements.

In response to Jack Dongarra’s statement, I agree that the investment in software tools for high performance computing has been lacking, but it’s been equally limited for hardware. While I didn’t do a comprehensive survey on the exhibit show floor at SC07 in Reno, almost all the machines displayed there were built from COTS (commodity off-the-shelf) processors, mostly x86-64 from Intel and AMD, some PowerPC from IBM, and in some cases, SPARC and MIPS. Any innovation seems to be in the interconnect, packaging, power, and cooling. Notable exceptions are traditional vector supercomputers from NEC and Cray, and the ClearSpeed accelerators. It seems the HPC market can’t support processor development; current process technology is just too expensive.

There is a great deal of hype and promise for accelerators. However, even here we depend on the commodity market to drive the technology and development, and hope to gain what benefit we can. We are in the dangerous position of depending on the scraps that fall off the PlayStation table — and if they take their picnic and go somewhere else, we’re in real trouble. If you think this is silly, try asking NVIDIA to add a feature to their graphics cards that will speed up your application but will hurt graphics performance. I can hear the laughter already.

Of more concern is what may happen with the mainstream processor business. AMD and Intel have already announced quad-core chips, with plans for eight cores and more. David Scott (Intel), at a focus session at the HP-CAST user group meeting the Saturday before SC07, noted that if you are willing to give up single-core performance, you can put a lot of cores on a single chip with today’s technology. There are many applications where such a strategy makes a great deal of sense: web services, database transactions — anything that responds to many small, independent requests. Think Google. In fact, most computing might fall into that market, where single-thread performance doesn’t matter, only total throughput does.

But not HPC. Imagine having to expose and manage five or ten times more parallelism just to match the performance of a single thread today. To get an actual performance improvement, you need yet another factor of parallelism on top of that.
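To put rough numbers on it, here is a back-of-the-envelope sketch; the per-core slowdown and serial fraction below are purely illustrative assumptions of mine, not measurements. Amdahl’s law shows how many of the simpler cores it takes just to break even against one of today’s fast cores:

    /* Back-of-the-envelope sketch.  Illustrative assumptions only:
       each simple core runs one thread at 1/5 the speed of today's
       fast core, and 5% of the work is serial (Amdahl's law). */
    #include <stdio.h>

    int main(void)
    {
        const double core_slowdown = 5.0;   /* assumed per-core slowdown */
        const double serial_frac   = 0.05;  /* assumed serial fraction   */

        for (int n = 1; n <= 64; n *= 2) {
            /* Amdahl's law: speedup of n slow cores over one slow core */
            double speedup = 1.0 / (serial_frac + (1.0 - serial_frac) / n);
            /* compare against a single fast core */
            double vs_fast_core = speedup / core_slowdown;
            printf("%3d slow cores: %.2fx one fast core\n", n, vs_fast_core);
        }
        return 0;
    }

With those assumptions, eight slow cores are barely better than break-even against one of today’s cores, and sixty-four deliver only about a threefold improvement: the first chunk of parallelism just buys back the lost single-thread speed before it buys any new performance.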

But guess who will win that architecture argument.

As for software, the dominant programming model for parallel computers hasn’t changed in almost 20 years, except to replace PVM with MPI. (I count substituting C or C++ for Fortran as a giant step sideways.) Perhaps this is inevitable. Douglass Post (DoD, HPCMP) pointed out at the SC07 panel that the lifetime of a large code is 20 to 30 years, whereas the lifetime of any large HPC system is more like 3 to 4 years. Portability, including performance portability, is more important than peak performance on any one system.
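For readers who have never lived inside that model, here is a minimal sketch of what it looks like (illustrative only, not drawn from any particular application): every process owns its own data, and every interaction is an explicit library call.

    /* Minimal MPI sketch: each rank computes a partial result and
       rank 0 collects the total.  Real codes add explicit domain
       decomposition, halo exchanges, and parallel I/O on top of this. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double local = (double)rank;   /* stand-in for real work */
        double total = 0.0;
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum over %d ranks = %g\n", size, total);

        MPI_Finalize();
        return 0;
    }

The point is not that this is hard to read; it’s that the parallelism, the data distribution, and the communication are all the programmer’s problem, and they have been for two decades.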

One of PGI’s consultants told us that today’s programmers like the MPI model, if only because it makes their lives easier. They can concentrate on porting and tuning today’s algorithms and programs to MPI, which is a lot of work, but not too mentally demanding. If we move to a model where parallel programming is less work, they’ll have to take on the task of finding better parallel algorithms, which is much more challenging.

So, to correct Jack Dongarra, the problem isn’t balance. The HPC ecosystem is in perfect balance, with little investment and innovation in either hardware or software. We’re in a precarious position now. The community is able to benefit from the COTS market, but it’s anyone’s guess how long we’ll be able to thrive there.

In response to Rob Pennington, I believe that the HPC market is too small to support an aggressive hardware business, and equally too small to support a software tools industry. It may be hard to justify the cost of a large HPC hardware installation, but at least you can proudly give tours of the machine room. It’s hard to justify a large software budget when all you get is a CD and a book (if you’re lucky).

Take compilers as an example, something near and dear to my heart. Historically, compiler development was taken on by the processor vendor and subsidized by that business. Compilers — and operating systems — hardly generated enough revenue to pay for themselves, but they were strategic investments by the vendors. Today’s HPC compilers are supported by the workstation business, and largely driven by it.

The hope has been that today’s workstations are as complex as yesteryear’s supercomputers and need the same complex compilers and tools, so there is a natural fit in requirements and solutions. But some tools are hard to build, notably compilers. If compilers were easy, we wouldn’t have library-based solutions (BLAS, Linpack, MPI, etc.); we’d have extended the languages and compilers to solve those problems. Creating, supporting, and supplying these tools is a big investment and commitment. In almost every problem space, a software vendor can make more money applying that investment and commitment to a larger market than HPC. If HPC users will also buy the product, that’s great, but they are not enough to drive the market. I’m sure that statement will produce a plethora of rebuttals from HPC software vendors, but I’d ask how much of the revenue for those products comes from non-HPC platforms.
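To make the library-versus-compiler point concrete, consider a sketch (a hypothetical example of mine, using the C BLAS interface): the triple loop is what a programmer would like to write and have the compiler turn into near-peak code; the one-line dgemm call is what we write instead, because the tuning lives in the library rather than in the compiler.

    /* Hypothetical illustration: naive matrix multiply vs. a tuned BLAS call. */
    #include <cblas.h>

    /* Plain C: portable, but getting near-peak speed from this loop
       is the compiler's (hard) job. */
    void matmul_naive(int n, const double *a, const double *b, double *c)
    {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < n; k++)
                    sum += a[i*n + k] * b[k*n + j];
                c[i*n + j] = sum;
            }
    }

    /* Library call: C = A * B, row-major; the tuning is in the BLAS,
       not in the compiler. */
    void matmul_blas(int n, const double *a, const double *b, double *c)
    {
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    n, n, n, 1.0, a, n, b, n, 0.0, c, n);
    }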

Many HPC sites act as if they believe they can (or have to) develop all their own software internally. They’ve become a community of blacksmiths, building their own tools and proud of it, with little need or desire for third-party software. To be fair, the HPC market is volatile enough that a certain amount of FUD about depending on independent software vendors can be justified.

To correct Rob Pennington, the tools will appear only if and when they apply to a larger market, or if some company (unlikely) or government agency (perhaps likely) chooses to make a long-term strategic investment.

-----

Michael Wolfe has developed compilers for over 30 years in both academia and industry, and is now a senior compiler engineer at The Portland Group, Inc. (www.pgroup.com), a wholly-owned subsidiary of STMicroelectronics, Inc. The opinions stated here are those of the author, and do not represent opinions of The Portland Group, Inc. or STMicroelectronics, Inc.
