HPC Power Efficiency and the Green500

By Kirk W. Cameron

November 20, 2013

The first Green500 List, ranking the energy efficiency of supercomputers, was launched in November 2007. Co-founder Kirk W. Cameron discusses the events that led to the creation of the Green500 List, its maturation, and future directions.

An Early Supercomputer Efficiency “List”

In 2001 the notions of Green HPC and energy proportional computing were unknown. There was no tangible evidence that power was an issue in supercomputers. Vendors simply built large systems to customer specifications. Performance kept increasing exponentially and while performance efficiency was of interest, power efficiency was not.

My early work in Green HPC was inspired by the tradeoffs inherent to power and performance. I imagined how varying power modes might make supercomputers more efficient. I speculated as to how such technologies would change the way we compute in HPC. But, in the beginning this seemed like a solution looking for a problem. No one at the time believed power was or ever would be an issue in HPC.

I needed data, and lots of it, if I was to convince a community that power was important. The Top500 List provided a plethora of performance data, but nothing related to power. Many of the larger supercomputer systems posted their specifications online, but the information was spotty at best and it quickly became obvious that no one was measuring power. If I wanted to improve power efficiency in supercomputers, not only would I have to prove conclusively that a power problem existed, I would have to start measuring systems myself! As a software guy, this was daunting.

Figure 1. Source: NSF Career Proposal Submission, K.W. Cameron, July 2003.

It seems almost comical now, but I spent 4 months obtaining the data for Figure 1. This “list” of power consumption for the top supercomputers from 6 different Top500 lists over ten years was the first of its kind. Perhaps the most striking feature is the exponential increase in raw power consumption of the top systems from 1993 to 2003. Moreover, despite separation by a decade of technological advances, the efficiency of the TMC CM-5 (~12 MFLOPS/watt) was more than double that of the Japanese Earth Simulator (5.6 MFLOPS/watt).

The trends are clear and irrefutable. Supercomputer power was a liability and would soon limit scalability. Of course, it would be almost 4 years before the community at large began to acknowledge that supercomputer power was a fundamental constraint. Let’s just say I’ve learned to be patient.

Origins of the Green500 List

Particularly in those early years, I spent a lot of time considering data collection and power measurement. My team built infrastructures and designed tools and methodologies to accurately track power usage in HPC systems. We ported our framework again and again to learn as much as we could about the tradeoffs between power and performance on emergent systems. We also built the first power-scalable HPC system prototype.

Wu Feng approached me in 2006 with the notion of creating a list of power efficient supercomputers akin to the Top500. I was already a firm believer in the need for such data having spent 4 months creating a small list of power consumption for 6 supercomputers. Furthermore, I had spent the last three years designing several generations of power measurement toolkits. My group arguably had compiled the largest, most detailed repository of HPC power data and we had a vast amount of experience measuring HPC system power.

My primary role was to design the power measurement run rules for the first list. We knew that other benchmarking methodologies had suffered when the system could be gamed easily. Based on my experience measuring power, we wrote a set of run rules describing how to easily measure a single node and extrapolate the power for a supercomputer running Linpack. The rules were designed to encourage participation by enabling non-experts to report their own power data with minimal investment in time and money. For those not reporting, we would use the UL ratings (see Figure 1) to fully populate the list.

Ease of participation was paramount. The Linpack benchmark was not ideal, but it was the only benchmark most supercomputer users reported regularly. MFLOPS per watt was not an ideal metric, but it was easy to report and would encourage energy-efficient, high-performance solutions.
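To make the single-node approach concrete, here is a minimal sketch of the arithmetic. It assumes the simplest possible reading of the idea: every node draws the same power as the measured one, so system power is node power times node count. The function names and the input values are hypothetical and chosen only for illustration; the actual run rules are not reproduced here.

```python
def extrapolate_system_power(node_power_watts: float, node_count: int) -> float:
    """Estimate full-system power by scaling one measured node linearly.

    Assumption: all nodes are identical and draw the same power under Linpack.
    """
    if node_power_watts <= 0 or node_count <= 0:
        raise ValueError("node power and node count must be positive")
    return node_power_watts * node_count


def mflops_per_watt(linpack_gflops: float, system_power_watts: float) -> float:
    """Convert a Linpack result in GFLOPS and a power figure in watts to MFLOPS/W."""
    if system_power_watts <= 0:
        raise ValueError("system power must be positive")
    return linpack_gflops * 1000.0 / system_power_watts


# Hypothetical inputs, not taken from any real system.
node_power = 350.0       # watts drawn by one node while running Linpack
nodes = 1024             # number of identical nodes in the system
rmax_gflops = 100_000.0  # Linpack performance of the whole system, in GFLOPS

system_power = extrapolate_system_power(node_power, nodes)
efficiency = mflops_per_watt(rmax_gflops, system_power)
print(f"{system_power / 1000:.1f} kW, {efficiency:.1f} MFLOPS/W")
```

The linear scaling is what keeps the measurement cheap: one meter on one node is enough to produce a list entry, at the cost of ignoring any power drawn outside the measured node.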

After 6 months of discussion we solicited participation from the broader community. About a year later, in November 2007, we released the first list. The launch of the first Green500 List was an event. As if scripted, just prior to launch, the power problem in data centers had become front-page news and rather suddenly many agreed that supercomputers needed to become more energy efficient.

Some embraced the list and touted high-ranked systems while deriding low-ranked systems. Some complained of being disenfranchised. Some ridiculed our methodology and metrics. Some took issue with the lack of community involvement or coordination with other lists, benchmarks, and government agencies.

The Green500 List Matures

While most of the early dialogue and press affirmed the need for the Green500 List, some valid criticisms led to significant improvements. For example, we released an updated list in early 2008 to include measured numbers from those that did not report to the first list. In succeeding lists, we limited the amount of information we track to focus exclusively on energy efficiency. Later, we obtained research funding to explore the potential use of other benchmarks and metrics.

We’ve actively sought feedback from users as the list has matured. This has resulted in additional lists such as the Little Green500. While entry to the Green500 requires placing among the 500 fastest systems in the world, the Little Green500 broadens this definition to include systems as fast as the slowest supercomputer from the three previous Top500 lists. The goal of this list is to provide efficiency information to those that would deploy smaller systems.

While the Green500 was a bit isolated initially, it is now part of a thriving community of activists promoting energy efficiency. The Climate Savers Computing Initiative, The Green Grid, and the Energy Efficient HPC Working Group are just a few of the proactive groups that ensure energy efficiency is now a first-class constraint in HPC design, procurement, and management. For example, the Energy Efficient HPC Working Group has been instrumental in identifying limitations in the Green500 measurement methodology. They have invested significant time and effort to isolate these limitations and suggest improvements to our methodologies that will likely be adopted in the future. They have also provided a conduit for opening discussions between the Department of Energy and vendors to establish standard practices for evaluating energy efficiency during the procurement process.

Legacy and Future of the Green500

The legacy of the Green500 is the establishment of a consistent, easy-to-follow set of power measurement run rules and the resulting data. Before the Green500 there was no widely accepted methodology for measuring supercomputer power, no way to track energy efficiency from year to year, and thus no way to encourage efficient design. The Green500 power measurement methodology has persisted nearly unchanged for almost 7 years, laying the foundation for a standardized methodology for collecting supercomputer power data. The methodology can always be improved. For example, the Top500 has tweaked its run rules over the years to prevent gaming. However, the early establishment of a set of consistent, easy-to-follow run rules provided fairness and stability in the Green500 List’s critical infancy.

The stability of the run rules enables us to consistently analyze trends in efficiency data from year to year. These trends lead to a number of interesting observations.

I agree with Horst. Assuming its efficiency could be maintained, the TMC CM-5 system from 1993 would have landed in position #493 on the inaugural November 2007 Green500 List. This position is ahead of both the Earth Simulator (#497) and ASCI Q (#500). From 1993 to 2007 the MFLOPS/watt of the fastest systems went from 12 to 357. From 2007 to 2013 the MFLOPS/watt of the fastest systems went from 357 to 3208.

An exascale system in 20 MW will require 50,000 MFLOPS/watt. If efficiency trends continue as they did from 1993 to 2007, a 20 MW exascale system is achievable in about 22 years (2035). The last 6 years saw tremendous efficiency improvements using accelerators. Assuming another efficiency boost from new technology equivalent to the gain from accelerators, an exascale system is achievable in 20 MW in about 9 years (2022). Most likely, we will see moderate gains placing us at exascale in 20 MW by about 2025. This is well beyond the goal of exascale in 20 MW by 2020.
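The 50,000 MFLOPS/watt requirement follows directly from the unit conversion, and the size of the remaining gap can be read off the efficiency figures quoted above. The short sketch below reproduces only that arithmetic; the variable and function names are illustrative, and it makes no attempt to model the projections themselves.

```python
# One exaflop per second expressed in MFLOPS, and a 20 MW power budget in watts.
EXAFLOP_IN_MFLOPS = 1e18 / 1e6
POWER_BUDGET_WATTS = 20e6

# Efficiency required to fit an exascale system into the power budget.
required_mflops_per_watt = EXAFLOP_IN_MFLOPS / POWER_BUDGET_WATTS


def improvement_factor(start: float, end: float) -> float:
    """Return how many times larger `end` is than `start`."""
    if start <= 0:
        raise ValueError("start must be positive")
    return end / start


# Efficiency figures (MFLOPS/W) for the fastest systems, as quoted in the text.
eff_1993, eff_2007, eff_2013 = 12.0, 357.0, 3208.0

gain_1993_2007 = improvement_factor(eff_1993, eff_2007)
gain_2007_2013 = improvement_factor(eff_2007, eff_2013)
remaining_gap = improvement_factor(eff_2013, required_mflops_per_watt)

print(f"required: {required_mflops_per_watt:,.0f} MFLOPS/W")
print(f"1993-2007 gain: {gain_1993_2007:.1f}x")
print(f"2007-2013 gain: {gain_2007_2013:.1f}x")
print(f"gap from 2013 to target: {remaining_gap:.1f}x")
```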

The shell game. While the Green500 gives us loads of information we never had before, there is little information about the power budget of the components of a system. While knowing total power is helpful, knowing how the power is spent across the system is critical to acquisition decisions. Is the majority of the power budget used on the GPUs, the memory, the CPUs, the disks, the network? Most systems in the Green500 are designed from commodity parts assembled at scale. If we truly want to promote efficiency and enable people to make informed design decisions, we need more insight into the details of where power is spent in these larger systems. Is a system with lots of disk arrays more or less of a power hog than a system with lots of GPUs? I really have no idea. And I’ve been studying power for more than a decade.

Will HPC ever embrace power management? The benefit of power management is clear: save energy. Work abounds showing energy savings can be achieved with little to no performance loss. Nonetheless, most supercomputers disable all power management. On the flip side, power management technologies such as Intel Turbo Boost can increase performance maximally within thermal limits. In fact, the SuperMUC supercomputer in Munich, Germany, was chastised by some in the community for enabling Turbo Boost during its early benchmarking and thus potentially skewing its Linpack results.

Trying to adapt benchmarking methodologies to guard against gaming is welcome. Trying to adapt benchmarking methodologies to neutralize the effects of technologies that improve efficiency is counterproductive and, I believe, ultimately futile. Systems are gaining in complexity every day. They are larger, have more parts and parallelism, and more autonomy in every generation. Processors throttle themselves, and memories and GPUs will soon do the same. Power and performance will not be fixed between two successive runs in these types of dynamic, complex systems. We must develop evaluation methodologies that embrace complexity and non-determinism since they will eventually transcend our ability to adapt. Furthermore, in the long run, the complexity and non-determinism we are attempting to ignore will be essential to maximize performance. Only when we accept complexity and non-determinism as constants can we adopt power management in production systems.

The Future. Accelerators are here to stay, but most computational scientists I know refuse to use them. I’m not sure which group will blink first, the hardware designers or the users. Perhaps the middleware folks will come to the rescue and make accelerators more programmable. In any case, I think we’ll see accelerators dominate the Green500 List until they are replaced by a new technology or abandoned by all.

In every talk I’ve seen by Intel and NVidia, the consensus seems to be that we are still really in the first generation of accelerators, with several significantly advanced generations to come. These next generations are faster, have more parallelism, more on-board memory, more power management, and are more tightly integrated with the board. This means, above all, more complexity. These systems will be even harder to program and evaluate. They will likely show modest efficiency gains in the Green500, but they will not match the percentage gains from the first generation, placing exascale beyond the 2020 goal.

While we co-founders have provided a consistent vision, biannual installments of the Green500 List are the work of an army of dedicated students, researchers, and passionate crusaders for energy efficiency. Without selfless adoption by a much broader community, the Green500 List would have been a fleeting anecdote.

It’s been more than twelve years since I started down the Green HPC path. I honestly thought after four to five years we would have exhausted all the interesting problems in HPC efficiency. The Green500 List’s impact has greatly exceeded my expectations. The introduction of a stable and fair methodology to track efficiency has withstood nearly 7 years of scrutiny and highlighted the insatiable need for ongoing research. What I failed to appreciate in the beginning was that power efficiency as a problem would transform and perpetuate with every new generation of supercomputer. Like the challenges of performance, reliability, and security, power efficiency is here to stay.

About the Author

Kirk W. Cameron is a Professor of Computer Science and a Faculty Fellow in the College of Engineering at Virginia Tech. Prof. Cameron is a pioneer and leading expert in Green Computing. Cameron is the Green IT columnist for IEEE Computer, Green500 co-founder, founding member of SPECPower, EPA consultant, Uptime Institute Fellow, and co-founder of power management software startup company MiserWare. His power measurement and management software tools are used by nearly half a million people in more than 160 countries. Accolades for his work include NSF and DOE Career Awards, the IBM Faculty Award, and being named Innovator of the Week by Bloomberg Businessweek Magazine. Prof. Cameron received the Ph.D. in Computer Science from Louisiana State University (2000) and B.S. in Mathematics from the University of Florida (1994).
