Blue Waters Opts Out of TOP500

By Tiffany Trader

November 16, 2012

The NCSA Blue Waters system is one of the fastest supercomputers in the world, but it won’t be appearing on the TOP500 list – nor will it be taking part in the HPC Challenge (HPCC) awards. While it’s generally understood that there are an unknown number of classified and commercial systems that don’t show up on the list, this is the first time an open science system has opted out in such a fashion.

According to the folks at the National Center for Supercomputing Applications (NCSA), there’s a good reason for this. In the days leading up to the 24th annual Supercomputing Conference (SC12) in Salt Lake City, HPCwire spoke with Blue Waters Project Director Bill Kramer to find out what went into this decision.

HPCwire: How long has Blue Waters been up and running? Would there have been enough time to run the Linpack benchmark and submit to the TOP500 list?

Bill Kramer: Oh sure, and we would have had good results if we had chosen to run it. We even had an early science system that was a resource in the US academic world going back to January last year, and we chose not to submit that for the June list.

The system has been up and running full-scale applications in test mode and debugging and scaling platforms and so on from mid-summer on, and particularly since Linpack is such a simple test and does not require I/O, we had plenty of time to run the test.

In fact we have run the test across the entire system and the HPCC test as well, so this was a very conscious decision not to do it – it does not reflect any problems or issues.

HPCwire: Did you get the results you would have expected and are you going to release them?

Kramer: We don’t see any reason to publicize it, but there were requirements in the contract. These tests obtained very good results, but we’d rather exercise the system with real applications. For example, there are some full-scale science codes that have run over 25,000 nodes for multiple days, and they’re actually doing a science problem as opposed to a trivial problem.

We’d much rather use real applications, with all the I/O and everything else in there, to vet the system and accomplish a real result along the way. Those are at least as stressful on the system as Linpack would be, because they exercise all parts of the system, not just the floating-point units. Our focus is reflecting what the real scientists do, not a very small subset of what some teams do.

HPCwire: So the contract with Cray did specify Linpack?

Kramer: HPCC was specified [editor’s note: HPCC includes Linpack], and that was one of hundreds of points – all of the others are much more relevant tests. For historical purposes, that was in there from the original NSF release, so we are meeting that, but it’s not relevant to whether the system is a quality system for sustained performance.

HPCwire: Are you releasing the HPCC results?

Kramer: No, and for the same reason. It’s better, but still doesn’t really reflect what to expect for real sustained performance for real applications. It’s better because it has multiple categories, but HPCC still lacks anything that has to do with I/O, which is one of the major bottlenecks, so testing interconnect and testing memory performance.

Our challenge is not with Linpack as a benchmark and not with having a list. Our concern is using a very simplified benchmark that has value in its own right, but not for the purpose of indicating usefulness of the system, or productivity of the system, or effectiveness of the system.

HPCwire: How and when was the decision arrived at?

Kramer: Our entire project focus has been on sustained petascale performance, and it’s not one-dimensional, it’s not peak performance, it’s not Linpack performance – it’s performance for sustained real-world applications. If you go back to the original NSF solicitation, they encapsulated that into a set of six applications that they projected far forward to the challenging scientific problems that required this type of system and they set their metric to solving that problem within a certain amount of wall-clock time.

Going back to the very beginning, the philosophical nature of how this project came to be was all about delivering effective petascale computing. The investment strategy was to have a very large amount of memory, a very large amount of storage rather than trying to obtain a high single metric.

As we progressed, we have, with the National Science Foundation and through many reviews, developed a much more meaningful metric from our point of view, called the Sustained Petascale Performance (SPP) test. The way we crafted that was by going to the science teams that we know and have been working with on the system and getting their real applications and their real science problems and using those as the measure of performance.

There are 12 application combinations that we are using to establish the performance of the system over a sustained petaflop in addition to the original NSF six applications. So we are actually going back to first principles: what are the scientists trying to do and making sure they’re able to do their required work within a reasonable amount of elapsed time.

The other part of this is enabling a diverse science base. The NSF, computational and data analytics community have a diverse portfolio of science, arguably the most diverse, and that diverse portfolio requires systems that perform well on that wide range of codes.

That’s really what our measures are and that’s what we remain focused on, so the decision to not list it is very consistent with what the project’s been about and what NSF’s goals have been going back to day one. The decision was made well before we needed to do any work to even submit the early system back last January. It’s been a long-term process; it was made mutually by the university and NSF as being the right thing to do for the real goals of our project, and we’re very comfortable with it.
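To make the time-to-solution idea in this answer concrete, here is a minimal sketch in Python. It is not the SPP test itself, whose definition is not given in this interview. It only illustrates the principle Kramer describes: measure each real application by its sustained rate (work done divided by elapsed wall-clock time) and check whether it finishes within a target time. The application names, operation counts, and time budgets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AppRun:
    name: str          # hypothetical application label
    flop_count: float  # floating-point operations performed by the run
    elapsed_s: float   # measured wall-clock time, in seconds
    target_s: float    # assumed wall-clock budget for the science problem

    @property
    def sustained_pflops(self) -> float:
        # Sustained rate = work / elapsed time, expressed in petaflops.
        return self.flop_count / self.elapsed_s / 1e15

    @property
    def meets_target(self) -> bool:
        # The run counts as a success only if it fits the time budget.
        return self.elapsed_s <= self.target_s


# Made-up numbers, for illustration only.
runs = [
    AppRun("app_a", 3.6e19, 30_000.0, 36_000.0),
    AppRun("app_b", 9.0e18, 10_000.0, 9_000.0),
]


def summarize(runs):
    """Return (name, sustained PF rounded to 2 places, met target?) per run."""
    return [(r.name, round(r.sustained_pflops, 2), r.meets_target) for r in runs]


for name, pflops, ok in summarize(runs):
    print(f"{name}: {pflops} PF sustained, within target: {ok}")
```

In this sketch a system is judged application by application, so a single high score on one kernel cannot hide a code that misses its time budget.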

HPCwire: Do you think we need a ranking system?

Kramer: I think lists are good, and I think as a focused, purposed benchmark, Linpack is good. I think the TOP500 list, though, combines those two things in a way that was interesting at some point, a while ago, but that now in some ways may be doing detriment to the community.

I have no trouble with lists and I think actually the community needs some idea of how we’re progressing, but we really need to be clear on what these lists mean, so for example, for much of the high-level systems on TOP500, what really determines how high they are is how much money is spent, not how well they perform on real applications.

There have been systems that never really get out to perform on real applications, but are on the list. There are ways to submit systems well before they are able to run many scientific or engineering applications. The historical nature of the list is perturbed by those other attributes and maybe those are what the lists measure. I can say for sure it doesn’t measure the progress in real sustained performance because there’s a severe disconnect between what the list says and what real sustained performance measures indicate.

HPCwire: Do we need something new or could we improve our current metrics to your satisfaction?

Kramer: I think there are ways to improve on relevance under the Linpack measurement. The people who put together the original list and maintain the list also talk about these things. Everybody’s afraid to take the first step. In the hallways everybody talks about the issues and the risks for misinterpretation for people who are not in our community, but then everyone says, “but I have to do it.”

Well, we’re fortunate enough that we don’t have to do it, and we’re taking the first step by saying this is enough, we need to go do something else. We are committed to working with others in the community to come up with a better way to describe how effective supercomputing is for solving unsolvable problems, and that’s really the important thing.

HPCwire: If the benchmarks are very complex or we have too many of them, is that practical for a wide range of systems?

Kramer: Yes, I’m convinced it is. The NAS parallel benchmarks were very effective in their time. I’m not saying that they’re the right ones now, but in their time period, for a decade or so… There were eight tests that everybody ran. They were pseudo-applications; they didn’t have I/O in them for example, and I/O was less of a challenge in those days, but they gave you a much better picture of what you could expect out of systems.

Other benchmark suites that have between 8–12 tests are being used. The DoD has a pretty good suite that represents a reasonable workload. NERSC has a good persistence suite that has evolved over time, but I think there are enough proofs of existence that yes, you can have a much more dynamic set of things. HPCC might be a place to go leverage with those codes, but that’s also still difficult to figure how it translates into real world applications and how much you can get out of that.

If you look at the graph of real measured performance, say with the NERSC suite of codes, and look at that through 15 years of history and you look at the TOP500 lists, you see that there’s a strong disconnect between what really is achievable with systems and what the list says.

The list also correlates with the amount of funding available to pay for things. The challenges that bottleneck real performance are not being addressed. So I think yes, you can craft those processes in a tractable amount of time that is portable and expandable and that’s been done several different ways.

HPCwire: Who are you directing this statement at? What outcome are you hoping for?

Kramer: Blue Waters is a leader in the community in many different ways, and this was another way we felt we could lead to get a more explicit dialogue going in the community about whether this is the way we want to use our metric for say exascale computing and whether this is still relevant.

HPCwire: What about push-back, both in general and your vendors, Cray and NVIDIA?

Kramer: We’ve been very clear with all of our partners, and others who may have been partners, that spending tremendous effort to get a number on a list that is not indicative of what’s really important to the project is not our priority. So we’ve been very open with the partners, and they have no objection to this.

HPCwire: In an article on the NCSA website, you write that “the TOP500 list and its associated Linpack have multiple serious problems,” and you’ve covered some of those already, would you like to highlight the ones you feel are most problematic?

Kramer: The main concerns are that it does not give an indication of value and particularly it doesn’t give an indication of value for sustained performance. Value is really the potential of a system to do work divided by its cost, so you can’t tell anything about the value; all you can tell is if you spend a lot of money on a system, you can get up high on the list.

Blue Waters is a project that is spending a significant amount of money, but it’s going into a very balanced system, not one that could have high FLOPS rates. I can tell you that if we had put all our money into peak performance and Linpack, we would have been number one on the list, for sure, for a while.

If I had not done the investment in the world’s largest memory or the world’s most intense storage system, and just said I want to have the most number of peak FLOPS that directly translate into Linpack FLOPS that directly translate to this number, and I don’t care about how hard it is for the science community to make use of those and how many science projects get disenfranchised because they’re not able to use GPUs at scale for a while, then we easily could have been on the top of the list for a number of cycles.

But that’s not our mission. It’s not what we designed our system for and it’s not what many people design their systems for. It could have led to a very poor choice for the real mission by paying attention to where the position is on the TOP500 list.

There are other aspects: the fact that you spend an awful lot of effort on getting something to work that you use once and throw away essentially all that effort. Some places have had to spend multiple weeks or months trying to get a number instead of doing science and engineering.

The improvements that we’re going to make to these SPP codes are actually improvements that go back to the science teams, so it’s a permanent improvement rather than a lot of that effort just going into a test case. It’s not a good way of allocating resources because you can’t reuse those resources.
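Kramer’s definition of value in this answer (the potential of a system to do work divided by its cost) can be illustrated with a small Python sketch. The two systems and all of their numbers are invented, and using sustained application performance as the stand-in for "potential to do work" is an assumption made here for illustration, not something the interview specifies.

```python
# Two hypothetical systems bought for the same budget (all figures made up).
systems = {
    "peak_heavy": {"linpack_pf": 15.0, "sustained_pf": 0.8, "cost_musd": 200.0},
    "balanced": {"linpack_pf": 8.0, "sustained_pf": 1.2, "cost_musd": 200.0},
}


def value(system):
    # Assumed proxy: sustained petaflops on real applications per million USD.
    return system["sustained_pf"] / system["cost_musd"]


# Ranking by Linpack score alone versus ranking by value.
by_linpack = max(systems, key=lambda name: systems[name]["linpack_pf"])
by_value = max(systems, key=lambda name: value(systems[name]))

print("Top by Linpack:", by_linpack)
print("Top by value:  ", by_value)
```

With these invented figures the two rankings disagree, which is the kind of disconnect between list position and usefulness that Kramer describes.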

HPCwire: Why now?

Kramer: The algorithmic space, the application space has changed dramatically from when the major implementation issues were dense linear algebra. There are many more things that are at least as important if not more important now in the way that systems are designed and what we’re trying to deal with.

Many methods have gone to sparse rather than dense, for example. As an indicator of what is really important in a system – we’re saying it’s time to relook at that and it’s not in the mission of our project to continue in that mode.

Last year at Supercomputing, there was a theme of sustained performance and there were many parties that took part in this discussion. There were panel sessions and papers, etc., and this year, we hope we’ll be able to start the dialogue about how we do a better job of metrics that we can easily explain, but are much more meaningful for the real missions of our HPC systems.

Maybe by SC13 there’s a way to report back to the community – a better way that parts of the community, or hopefully the whole community, can say … after 20 years of doing it this way it’s time to do something different.
