HPCS: The Big Picture

By Nicole Hemsoth

April 7, 2006

DARPA's High Productivity Computer Systems (HPCS) program is an ambitious attempt to propel supercomputing to the next level. In this special issue of HPCwire, each HPCS-funded vendor (Cray, IBM and Sun Microsystems) has provided us with a description of their proposed design — the three feature articles that follow this one discuss their individual approaches.

But before you delve into the details, you may want to read one man's perspective on what DARPA's HPCS program means to the high performance computing community. HPCwire recently spoke with Douglass Post, chief scientist of the DoD High Performance Computing Modernization Program, to get his impressions of the program. The text that follows is an excerpt from a longer interview that will be featured in an upcoming issue.

HPCwire: Can you help us understand the big picture of the DARPA HPCS program?

Post: The DARPA High Productivity Computing Systems program is funded by the Defense Advanced Research Projects Agency within the DoD. Its goal is to support industry in developing the ability to manufacture and deliver a petaflop-class computer that is substantially easier to program and use than the computers the industry is evolving toward today. The program treats high-performance computing as an integrated activity involving the computers themselves, programmers and code developers, and production users, and it seeks improvements in the whole system. A large part of the growth in computer performance is being achieved through increased architectural complexity, which makes it very challenging to develop codes that take advantage of the increased computer power.

A goal of the HPCS program is to reduce the “time to solution” both for production runs and for code development. The program calls for computer hardware that emphasizes increased power for both floating-point and integer arithmetic, large memories, high memory bandwidth and low memory latency, and other features that improve the ability of computational scientists and engineers to develop and run codes that can fully exploit the power of supercomputing. From what I have seen, the computer vendors (IBM, Cray and Sun in Phase II) have really looked hard at what they can do to make a computer that is orders of magnitude more productive than a traditional Linux cluster. They have developed some exciting new hardware and software technologies, and I judge that an HPCS-class machine will enable computational science and engineering to address whole new classes of problems.

The emphasis on productivity is a key part of the program, and so is its emphasis on software. It's very much not the “build it and they will come” approach. There is an effort to develop benchmarks that measure the performance of computers for the applications that matter to computational scientists and engineers, and an emphasis on developing ways to quantify productivity for code development and production. Unless we can quantify productivity, it will never be on the same footing as FLOPS/dollar in computer procurement evaluations, and we will continue to get computers that do a great job of running Linpack but don't do nearly as well with most real applications, and are very challenging to develop codes for and run on.
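
To make the stakes concrete, here is a minimal sketch in C, using entirely hypothetical machines and numbers, of how a procurement comparison based only on FLOPS per dollar can point the other way from one that folds in code-development and run time. The figures and the crude time-to-solution formula are assumptions for illustration only, not an HPCS productivity metric.

#include <stdio.h>

/* Hypothetical figures for two machines; not real systems or HPCS data. */
typedef struct {
    const char *name;
    double linpack_gflops;   /* sustained Linpack-style rate */
    double price_dollars;    /* purchase price */
    double dev_months;       /* estimated time to port and tune the application */
    double run_hours;        /* wall-clock time for one production run */
} machine;

int main(void) {
    machine m[2] = {
        {"Cluster A (cheap, harder to program)",   50000.0, 2.0e6, 18.0, 200.0},
        {"Machine B (costlier, easier to program)", 40000.0, 3.0e6,  6.0, 120.0},
    };
    for (int i = 0; i < 2; i++) {
        double flops_per_dollar = m[i].linpack_gflops / m[i].price_dollars;
        /* Crude "time to solution": development time plus one production run,
           converted to hours (30 days of 24 hours per month, for illustration). */
        double time_to_solution = m[i].dev_months * 30.0 * 24.0 + m[i].run_hours;
        printf("%s: %.4f GFLOPS/$, time to solution %.0f hours\n",
               m[i].name, flops_per_dollar, time_to_solution);
    }
    return 0;
}

Under these made-up numbers, Cluster A wins on FLOPS per dollar while Machine B delivers the answer far sooner, which is exactly the gap a quantified productivity metric is meant to expose.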

The productivity team has been doing detailed case studies of representative scientific and engineering code projects to identify the characteristics of application codes, the workflows for code development and production, the “bottlenecks” and obstacles to code development and production, and the “lessons learned,” so that decisions by the productivity team and the vendors are based on real data rather than anecdote. The vendors are developing new computer languages and tools that improve productivity by allowing programmers to express parallelism at higher levels of abstraction. The “catch-22” with new languages is that no one will use a new language until it is mature, and it will never become mature unless it is used. This has led to an effort to consolidate the language efforts of the vendors to produce a single new language that the community can adopt.

This summer, the program will enter Phase III, when DARPA will select one or two of the Phase II vendors (Cray, IBM and Sun) for funding that will enable them to accept orders from prospective customers for a multi-petaflop computer in 2010.

In the interest of full disclosure, I believe so strongly in the goals of the program that I joined the Productivity Team several years ago.

HPCwire: Do you think we need to move beyond the legacy HPC programming languages — C, Fortran, MPI — to be able to take advantage of petascale-level hardware?

Post: The language challenge is immense. MPI is a fairly low-level language, but it's reliable, predictable and works. It's also an extension of Fortran, C and C++, so developers don't have to learn another language and have minimal refactoring to do to parallelize a code. There is a tremendously large potential market for a language that enables the code developer to write parallel operations at a higher level of abstraction than MPI. UPC and Co-Array Fortran are two examples that are beginning to get some acceptance. The DARPA HPCS vendors are working on three different languages (IBM's X10, Sun's Fortress and Cray's Chapel), and other languages are being developed at various institutions. It's going to be difficult for any of these new languages to gain acceptance. It's a chicken-and-egg issue: which comes first, the new language or its acceptance by the community? Developers of large-scale, complex scientific and engineering codes only succeed if they are fanatical about risk minimization. They can't risk spending five to 10 years writing their code in a new language only to find that the new language didn't gain general acceptance and support for it fades. This has already happened with High Performance Fortran, a parallel language initially released in 1993: it wasn't widely used and is no longer well supported.
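
As a generic illustration of the level of abstraction Post is describing, and not code drawn from the HPCS program itself, the following C sketch sums a series in parallel with MPI. The decomposition of the index space, the ranks and the communication are all spelled out by the programmer; the PGAS and HPCS languages he mentions aim to let the same global operation be written much closer to its serial form.

#include <mpi.h>
#include <stdio.h>

/* Summing N terms in parallel the MPI way: the programmer manages the
   data decomposition, the ranks and the communication explicitly, which
   is the "fairly low-level" style Post describes. Languages such as UPC,
   Co-Array Fortran, X10, Fortress and Chapel aim to express the same
   global sum declaratively, without hand-written message passing. */
#define N 1000000

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank carves out its own block of the index space by hand. */
    long begin = (long)N * rank / size;
    long end   = (long)N * (rank + 1) / size;
    double local = 0.0;
    for (long i = begin; i < end; i++)
        local += 1.0 / (double)(i + 1);   /* stand-in for real work */

    /* Explicit communication to combine the partial results on rank 0. */
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("sum = %f\n", global);

    MPI_Finalize();
    return 0;
}

Built with an MPI compiler wrapper such as mpicc and launched with mpirun, the same pattern runs unchanged from a workstation to a large cluster, which is a large part of why MPI has been so durable despite its low level.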

Developers of large-scale codes usually put portability near the top of their priority list. Large-scale codes often have lifetimes of 10 to 30 years, much longer than the three to five years between generations of high-performance computers. In addition, a successful code is expected to run on several different platforms at any one time. Languages thus must have wide acceptance; they must be long-lived and work on almost all, if not all, platforms. No one is going to base a new large project on a new language that only runs on a few platforms, isn't mature, hasn't achieved wide acceptance and support, and doesn't appear likely to be a community standard for the next 10 to 20 years. Recognizing this, the DARPA HPCS program has launched an effort led by Rusty Lusk at Argonne National Laboratory to develop a strategy for consolidating the parallel languages being developed by the DARPA HPCS vendors into one language. If that language is compatible with C, C++, Fortran and MPI, then there is a chance that when code developers write new sections of their codes, they may write some of them in the new language. If the experience is good, the new language will slowly be adopted.

HPCwire: Why is the HPCS program a new direction in computer system development?

Post: It's the first major program in a long time to devote a significant effort to making computers more user-friendly. Indeed, one sees a growing amount of publicity for the Top500 list and Linpack as the measure of performance. This masks the difficulty of developing codes for massively parallel computers and running them, and the challenge of getting good performance for a general code that treats many effects spanning many orders of magnitude in space and time. The DARPA HPCS program emphasizes productivity: reducing the challenges of using the computers and decreasing the time to solution for code development and production runs. It emphasizes fast random memory access and low memory latency, as well as fast processing and large memory. Another key goal is the development of a quantitative measure of productivity. Until we have a reasonable metric for productivity, price/performance based on benchmarks like Linpack will dominate procurement decisions, and we will have lots of computers that are a challenge to use. Fewer and fewer groups will buy the high-productivity machines, because they will cost more than low-productivity machines while their productivity advantage won't be quantifiable.

—–

Douglass E. Post has been developing and applying large-scale multi-physics simulations for almost 35 years. He is the Chief Scientist of the DoD High Performance Computing Modernization Program and a member of the senior technical staff of the Carnegie Mellon University Software Engineering Institute. He also leads the multi-institutional DARPA High Productivity Computing Systems Existing Code Analysis team. Doug received a Ph.D. in Physics from Stanford University in 1975. He led the tokamak modeling group at the Princeton University Plasma Physics Laboratory from 1975 to 1993 and served as head of the International Thermonuclear Experimental Reactor (ITER) Joint Central Team Physics Project Unit (1988-1990) and head of the ITER Joint Central Team In-vessel Physics Group (1993-1998). More recently, he was the A-X Associate Division Leader for Simulation at Lawrence Livermore National Laboratory (1998-2000) and the Deputy X Division Leader for Simulation at Los Alamos National Laboratory (2001-2002), positions that involved leadership of major portions of the U.S. nuclear weapons simulation program. He has published over 230 refereed papers, conference papers and books in computational, experimental and theoretical physics and software engineering, with over 5,000 citations. He is a Fellow of the American Physical Society, the American Nuclear Society, and the Institute of Electrical and Electronics Engineers. He serves as an Associate Editor-in-Chief of the joint AIP/IEEE publication Computing in Science and Engineering.
