Blue Waters Opts Out of TOP500

By Tiffany Trader

November 16, 2012

The NCSA Blue Waters system is one of the fastest supercomputers in the world, but it won’t be appearing on the TOP500 list – nor will it be taking part in the HPC Challenge (HPCC) awards. While it’s generally understood that there are an unknown number of classified and commercial systems that don’t show up on the list, this is the first time an open science system has opted out in such a fashion.

According to the folks at the National Center for Supercomputing Applications (NCSA), there’s a good reason for this. In the days leading up to the 24th annual Supercomputing Conference (SC12) in Salt Lake City, HPCwire spoke with Blue Waters Project Director Bill Kramer to find out what went into this decision.

HPCwire: How long has Blue Waters been up and running? Would there have been enough time to run the Linpack benchmark and submit to the TOP500 list?

Bill Kramer: Oh sure, and we would have had good results if we had chosen to run it. We even had an early science system that was a resource in the US academic world going back to January last year, and we chose not to submit that for the June list.

The system has been up and running full-scale applications in test mode and debugging and scaling platforms and so on from mid-summer on, and particularly since Linpack is such a simple test and does not require I/O, we had plenty of time to run the test.

In fact we have run the test across the entire system and the HPCC test as well, so this was a very conscious decision not to do it – it does not reflect any problems or issues.

HPCwire: Did you get the results you would have expected and are you going to release them?

Kramer: We don’t see any reason to publicize it, but there were requirements in the contract. These tests obtained very good results, but we’d rather exercise the system with real applications. For example, there are some full-scale science codes that have run over 25,000 nodes for multiple days, and they’re actually doing a science problem as opposed to a trivial problem.

We’d much rather use real applications, with all the I/O and everything else in there, to vet the system and accomplish a real result along the way. Those are at least as stressful on the system as Linpack would be because they exercise all parts of the system, not just the floating point units. Our focus is reflecting what the real scientists do, not a very small subset of what some teams do.

HPCwire: So the contract with Cray did specify Linpack?

Kramer: HPCC was specified [editor’s note: HPCC includes Linpack], and that was one of hundreds of points – all of the others are much more relevant tests. For historical purposes, that was in there from the original NSF release, so we are meeting that, but it’s not relevant to whether the system is a quality system for sustained performance.

HPCwire: Are you releasing the HPCC results?

Kramer: No, and for the same reason. It’s better, but still doesn’t really reflect what to expect for real sustained performance for real applications. It’s better because it has multiple categories, but HPCC still lacks anything that has to do with I/O, which is one of the major bottlenecks, so it is testing interconnect and testing memory performance.

Our challenge is not with Linpack as a benchmark and not with having a list; our concern is using a very simplified benchmark that has value in its own right, but not for the purpose of indicating usefulness of the system, or productivity of the system or effectiveness of the system.

HPCwire: How and when was the decision arrived at?

Kramer: Our entire project focus has been on sustained petascale performance, and it’s not one-dimensional, it’s not peak performance, it’s not Linpack performance – it’s performance for sustained real-world applications. If you go back to the original NSF solicitation, they encapsulated that into a set of six applications that they projected far forward to the challenging scientific problems that required this type of system and they set their metric to solving that problem within a certain amount of wall-clock time.

Going back to the very beginning, the philosophical nature of how this project came to be was all about delivering effective petascale computing. The investment strategy was to have a very large amount of memory, a very large amount of storage rather than trying to obtain a high single metric.

As we progressed, we have, with the National Science Foundation and through many reviews, developed a much more meaningful metric, from our point of view, called the Sustained Petascale Performance (SPP) test. The way we crafted that was by going to the science teams that we know and have been working with on the system and getting their real applications and their real science problems and using those as the measure of performance.

There are 12 application combinations that we are using to establish the performance of the system over a sustained petaflop in addition to the original NSF six applications. So we are actually going back to first principles: what are the scientists trying to do and making sure they’re able to do their required work within a reasonable amount of elapsed time.

The other part of this is enabling a diverse science base. The NSF, computational and data analytics community have a diverse portfolio of science, arguably the most diverse, and that diverse portfolio requires systems that perform well on that wide range of codes.

That’s really what our measures are and that’s what we remain focused on, so the decision to not list it is very consistent with what the project’s been about and what NSF’s goals have been going back to day one. The decision was made well before we needed to do any work to even submit the early system back in last January. It’s been a long-term process; it was made mutually by the university and NSF as being the right thing to do for the real goals of our project, and we’re very comfortable with it.

HPCwire: Do you think we need a ranking system?

Kramer: I think lists are good, and I think as a focused, purposed benchmark, Linpack is good. I think the TOP500 list, though, combines those two things in a way that was interesting at some point, a while ago, but that now in some ways may be doing detriment to the community.

I have no trouble with lists and I think actually the community needs some idea of how we’re progressing, but we really need to be clear on what these lists mean, so for example, for much of the high-level systems on TOP500, what really determines how high they are is how much money is spent, not how well they perform on real applications.

There have been systems that never really get out to perform on real applications, but are on the list. There are ways to submit systems well before they are able to run many scientific or engineering applications. The historical nature of the list is perturbed by those other attributes and maybe those are what the lists measure. I can say for sure it doesn’t measure the progress in real sustained performance because there’s a severe disconnect between what the list says and what real sustained performance measures indicate.

HPCwire: Do we need something new or could we improve our current metrics to your satisfaction?

Kramer: I think there are ways to improve on relevance under the Linpack measurement. The people who put together the original list and maintain the list also talk about these things. Everybody’s afraid to take the first step. In the hallways everybody talks about the issues and the risks for misinterpretation for people who are not in our community, but then everyone says, “but I have to do it.”

Well, we’re fortunate enough that we don’t have to do it, and we’re taking the first step by saying this is enough, we need to go do something else. We are committed to working with others in the community to come up with a better way to describe how effective supercomputing is for solving unsolvable problems, and that’s really the important thing.

HPCwire: If the benchmarks are very complex or we have too many of them, is that practical for a wide range of systems?

Kramer: Yes, I’m convinced it is. The NAS parallel benchmarks were very effective in their time. I’m not saying that they’re the right ones now, but in their time period, for a decade or so… There were eight tests that everybody ran. They were pseudo-applications; they didn’t have I/O in them for example, and I/O was less of a challenge in those days, but they gave you a much better picture of what you could expect out of systems.

Other benchmark suites that have between 8–12 tests are being used. The DoD has a pretty good suite that represents a reasonable workload. NERSC has a good persistence suite that has evolved over time, but I think there are enough proofs of existence that yes, you can have a much more dynamic set of things. HPCC might be a place to go leverage with those codes, but that’s also still difficult to figure how it translates into real world applications and how much you can get out of that.

If you look at the graph of real measured performance, say with the NERSC suite of codes, and look at that through 15 years of history and you look at the TOP500 lists, you see that there’s a strong disconnect between what really is achievable with systems and what the list says.

The list also correlates with the amount of funding available to pay for things. The challenges that bottleneck real performance are not being addressed. So I think yes, you can craft those processes in a tractable amount of time that is portable and expandable and that’s been done several different ways.

HPCwire: Who are you directing this statement at? What outcome are you hoping for?

Kramer: Blue Waters is a leader in the community in many different ways, and this was another way we felt we could lead to get a more explicit dialogue going in the community about whether this is the way we want to use our metric for say exascale computing and whether this is still relevant.

HPCwire: What about push-back, both in general and your vendors, Cray and NVIDIA?

Kramer: We’ve been very clear with all of our partners, and others who may have been partners, that spending tremendous effort to get a number on a list that is not indicative of what’s really important to the project is not our priority. So we’ve been very open with the partners and they have no objection to this.

HPCwire: In an article on the NCSA website, you write that “the TOP500 list and its associated Linpack have multiple serious problems,” and you’ve covered some of those already, would you like to highlight the ones you feel are most problematic?

Kramer: The main concerns are that it does not give an indication of value and particularly it doesn’t give an indication of value for sustained performance. Value is really the potential of a system to do work divided by its cost, so you can’t tell anything about the value; all you can tell is if you spend a lot of money on a system, you can get up high on the list.

Blue Waters is a project that is spending a significant amount of money, but it’s going into a very balanced system, not one that could have high FLOPS rates. I can tell you that if we had put all our money into peak performance and Linpack, we would have been number one on the list, for sure, for a while.

If I had not done the investment in the world’s largest memory or the world’s most intense storage system, and just said I want to have the most number of peak FLOPS that directly translate into Linpack FLOPS that directly translates to this number and I don’t care about how hard it is for the science community to make use of those and how many science projects get disenfranchised because they’re not able to use GPUs at scale for a while, then we easily could have been on the top of the list for a number of cycles.

But that’s not our mission. It’s not what we designed our system for and it’s not what many people design their systems for. It could have led to a very poor choice for the real mission by paying attention to where the position is on the TOP500 list.

There are other aspects: the fact that you spend an awful lot of effort on getting something to work that you use once and throw away essentially all that effort. Some places have had to spend multiple weeks or months trying to get a number instead of doing science and engineering.

The improvements that we’re going to make to these SPP codes are actually improvements that go back to the science teams, so it’s a permanent improvement rather than a lot of that effort just going into a test case. It’s not a good way of allocating resources because you can’t reuse those resources.
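[Editor’s illustration: Kramer describes value above as "the potential of a system to do work divided by its cost." The short Python sketch below shows how a ranking by that ratio can differ from a ranking by Linpack rate alone. The `System` class, the `value` and `rank` helpers, and every number are hypothetical and chosen only for illustration; none of them are measurements of Blue Waters or any other machine, and the interview does not specify how "potential" should be measured. Here it is assumed to be a sustained rate on real applications.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """Hypothetical system description; all fields are made-up illustrative values."""
    name: str
    sustained_pflops: float  # assumed sustained rate on real applications
    linpack_pflops: float    # assumed Linpack rate
    cost_musd: float         # assumed cost in millions of US dollars


def value(potential: float, cost: float) -> float:
    """Value as described in the interview: potential to do work divided by cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return potential / cost


def rank(systems, key):
    """Return system names ordered from highest to lowest according to key."""
    return [s.name for s in sorted(systems, key=key, reverse=True)]


# Two hypothetical systems bought for the same (made-up) budget.
balanced = System("balanced", sustained_pflops=1.0, linpack_pflops=5.0, cost_musd=200.0)
flops_heavy = System("flops-heavy", sustained_pflops=0.5, linpack_pflops=10.0, cost_musd=200.0)
systems = [balanced, flops_heavy]

# Ranking by Linpack rate alone versus ranking by sustained work per dollar.
by_linpack = rank(systems, key=lambda s: s.linpack_pflops)
by_value = rank(systems, key=lambda s: value(s.sustained_pflops, s.cost_musd))

print("By Linpack:", by_linpack)
print("By value:  ", by_value)
```

With these made-up inputs the two orderings are reversed, which is the kind of disconnect Kramer is pointing to; with different assumed inputs they could just as easily agree.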

HPCwire: Why now?

Kramer: The algorithmic space, the application space has changed dramatically from when the major implementation issues were dense linear algebra. There are many more things that are at least as important if not more important now in the way that systems are designed and what we’re trying to deal with.

Many methods have gone to sparse rather than dense, for example. As an indicator of what is really important in a system – we’re saying it’s time to relook at that and it’s not in the mission of our project to continue in that mode.

Last year at Supercomputing, there was a theme of sustained performance and there were many parties that took part in this discussion. There were panel sessions and papers, etc., and this year, we hope we’ll be able to start the dialogue about how we do a better job of metrics that we can easily explain, but are much more meaningful for the real missions of our HPC systems.

Maybe by SC13 there’s a way to report back to the community – a better way that parts of the community, or hopefully the whole community, can say … after 20 years of doing it this way it’s time to do something different.
