One of the things I really enjoy about the annual HPCC conference in Newport is the intimate level of conversation that continues from the meeting hall to the networking breaks to the social events. Even as the bartenders are sounding “Last Call” you can count on overhearing conversations about technical and political challenges in the hallways and lobby area of the Goat Island Hyatt Regency, where the conference is held each year.
According to my very unscientific survey, two people were responsible for spawning more discussion this year than any others. And I mean that in a commendable and congratulatory way. The doctors were in the house — Michael Wolfe from PGI and Stephen Wheat from Intel.
Dr. Michael Wolfe of the Portland Group, a well-known compiler engineer with a lot of credibility in this community, introduced a very interesting and controversial topic. Michael raised the issue of some serious problems on the horizon for HPC, including the fact that multicore has introduced a whole new set of challenges for achieving efficient code parallelization.
He brought up a number of interesting points laying out the history of HPC tools, from the “old days” when the tools were supplied by the hardware vendors, to where we are today, in a commodity ecosystem of HPC platforms, which for the most part consist of large clusters of COTS parts with some innovation applied to packaging and interconnects. Michael made the point that the tools used to develop the applications for today’s HPC clusters are the same tools used for single-user workstations with just a few additions. The tools are inexpensive or open source with ongoing development costs supported by the workstation market.
Michael’s question that spawned much discussion was this: “If today’s system vendors can’t afford to provide new HPC tools, and today’s users aren’t going to pay for new HPC tools, how then will they be developed… and without a new generation of HPC tools, can we ever build highly effective applications for new multicore environments?”
A number of conversations on this topic continued during the networking breaks and into the late night hours in the wandering hotel lounge that somehow changed location each evening. In one conversation I overheard, several folks were agreeing that the challenge is so daunting that, as a community, we need to rally around a new approach to HPC architecture. One person used the analogy of buying a new high-performance sports car equipped with a 15-year-old engine that was upgraded with a number of new options, where each option brought new problems of its own.
And to quote Michael Wolfe one more time, “There’s got to be a better way to go massively parallel… doesn’t there?”
No stranger to this conference, Dr. Stephen Wheat from Intel delivered a presentation designed to help the attendees put their arms around a number of the challenges faced by this community. His well-organized and well-articulated discussion spawned a number of side conversations on how this community can continue to meet the demanding performance requirements in a market with an insatiable appetite.
Stephen also emphasized Intel’s long-term commitment to HPC. We all know that the company’s commitment to HPC has been questioned from time to time over the years, but as the ecosystem expands and the traditional perceived boundaries of HPC fade away, the stakes are bigger than ever before for Intel.
Stephen made a great point that a number of people jotted down: “HPC will experience every aspect of new computing technologies first. And in that respect HPC plays a key role in driving innovation into the enterprise market.”
Interestingly enough, just prior to the Newport conference, media coverage was faulting AMD for not having an articulated HPC strategy.
Many heads were nodding in agreement when Stephen spoke strongly about the need for funding and metrics in large HPC procurements. Beyond arguing that government R&D funding is critical to keeping HPC at the forefront and keeping the nation competitive, he pointed out something many people shy away from discussing: in today’s economy, it is really tough to bid aggressively on the largest research systems.
In fact, “A big win could put a company out of business,” according to Wheat. “Many large-scale deals that shape the future are larger than the market cap of the players. Grabbing for the brass ring shouldn’t be a fatal activity, especially if you are successful in grabbing the ring.”
This became a real thought leadership discussion. Wheat made the point that we need a big change to procurement guidelines and to the metrics for rewarding those who are willing to step up and take a risk. Procurements should normalize on total value (reliability, availability, TTS), he argued, and successes should be rewarded more than failures are penalized.
Well stated, Stephen.
Stephen and Michael also participated in Bob Feldman’s colorful panel that touched on politics, emotions, religion and gun control. OK, maybe not all those points. But in his typical entertaining style, Bob asked the panelists questions that were thought-provoking and wonderfully entertaining for all of us. For example, his first question to the panel was this: “It is January 2009, and we have a new president. You are asked to join in a briefing for the new president on key ways to keep American HPC competitive in order to support our economy and national security. What do you choose to tell him or her about the issues and what must be done?” Bob had everyone thinking of HPC at a completely different level.
Now that’s why I love this conference!
Most interesting factoid picked up at the 2008 Newport conference: It costs $15 million a year for the electricity to run the computing facility at ORNL.
Most interesting point someone attempted to make at the wandering hotel lounge: “All that multicore is doing is dragging us backwards. It’s inhibiting our growth and competitiveness because it’s forcing us to throw way too many resources at implementation, programming, optimization, support. Multicore will be on the tombstones of many companies!”
That was quickly followed by, “Could we have another round over here?”
Now that’s something to think about, eh?
For part 1 of this article, go here.