HPC Power Efficiency and the Green500
The first Green500 List was launched in November 2007, ranking the energy efficiency of supercomputers. Co-founder Kirk W. Cameron discusses the events that led to the creation of the Green500 List, its maturation, and future directions.
An Early Supercomputer Efficiency “List”
In 2001 the notions of Green HPC and energy proportional computing were unknown. There was no tangible evidence that power was an issue in supercomputers. Vendors simply built large systems to customer specifications. Performance kept increasing exponentially and while performance efficiency was of interest, power efficiency was not.
My early work in Green HPC was inspired by the tradeoffs inherent to power and performance. I imagined how varying power modes might make supercomputers more efficient. I speculated as to how such technologies would change the way we compute in HPC. But, in the beginning this seemed like a solution looking for a problem. No one at the time believed power was or ever would be an issue in HPC.
I needed data. And lots of it if I was to convince a community power was important. The Top500 List provided a plethora of performance data, but nothing related to power. Many of the larger supercomputer systems posted their specifications online, but the information was spotty at best and it became obvious quickly that no one was measuring power. If I wanted to improve power efficiency in supercomputers, not only would I have to prove conclusively that a power problem existed, I would have to start measuring systems myself! As a software guy, this was daunting.
Figure 1. Source: NSF Career Proposal Submission, K.W. Cameron, July 2003.
It seems almost comical now, but I spent 4 months obtaining the data for Figure 1. This “list” of power consumption for the top supercomputers from 6 different Top500 lists over ten years was the first of its kind. Perhaps the most striking feature is the exponential increase in raw power consumption of the top systems from 1993 to 2003. Moreover, despite separation by a decade of technological advances, the efficiency of the TMC CM-5 (~12 MFLOPS/watt) was more than double that of the Japanese Earth Simulator (5.6 MFLOPS/watt).
The trends are clear and irrefutable. Supercomputer power was a liability and would soon limit scalability. Of course, it would be almost 4 years before the community at large began to acknowledge supercomputer power was a fundamental constraint. Let’s just say I’ve learned to be patient.
Origins of the Green500 List
Particularly in those early years, I spent a lot of time considering data collection and power measurement. My team built infrastructures and designed tools and methodologies to accurately track power usage in HPC systems. We ported our framework again and again to learn as much as we could about the tradeoffs between power and performance on emergent systems. We also built the first power-scalable HPC system prototype.
Wu Feng approached me in 2006 with the notion of creating a list of power-efficient supercomputers akin to the Top500. I was already a firm believer in the need for such data, having spent 4 months creating a small list of power consumption for 6 supercomputers. Furthermore, I had spent the last three years designing several generations of power measurement toolkits. My group had arguably compiled the largest, most detailed repository of HPC power data and we had a vast amount of experience measuring HPC system power.
My primary role was to design the power measurement run rules for the first list. We knew that other benchmarking methodologies had suffered when the system could be gamed easily. Based on my experience measuring power, we wrote a set of run rules describing how to easily measure a single node and extrapolate the power for a supercomputer running Linpack. The rules were designed to encourage participation by enabling non-experts to report their own power data with minimal investment in time and money. For those not reporting, we would use the UL ratings (see Figure 1) to fully populate the list.
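The single-node extrapolation described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the Green500 run rules themselves: it assumes identical nodes and ignores interconnect, storage, and cooling power. The function names and the example figures (350 W per node, 1,000 nodes, 70 TFLOPS Linpack) are hypothetical.

```python
def extrapolate_system_power(node_watts: float, node_count: int) -> float:
    """Estimate full-system power in watts by scaling one measured node.

    Assumes every node draws the same power as the measured one while
    running Linpack (a simplification made for this sketch).
    """
    if node_watts <= 0 or node_count <= 0:
        raise ValueError("node_watts and node_count must be positive")
    return node_watts * node_count


def mflops_per_watt(linpack_mflops: float, system_watts: float) -> float:
    """Energy efficiency: sustained Linpack MFLOPS divided by system watts."""
    if system_watts <= 0:
        raise ValueError("system_watts must be positive")
    return linpack_mflops / system_watts


if __name__ == "__main__":
    # Hypothetical example values, not taken from any real system.
    node_watts = 350.0               # measured power of one node under Linpack
    node_count = 1000                # number of nodes in the system
    linpack_mflops = 70_000_000.0    # 70 TFLOPS expressed in MFLOPS

    system_watts = extrapolate_system_power(node_watts, node_count)
    efficiency = mflops_per_watt(linpack_mflops, system_watts)
    print(f"Estimated system power: {system_watts:.0f} W")
    print(f"Efficiency: {efficiency:.1f} MFLOPS/watt")
```

The appeal of this approach is that a site needs only one power meter and one node to report a number, which is what kept the cost of participation low.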
Ease of participation was paramount. The Linpack benchmark was not ideal, but it was the only benchmark most supercomputer users reported regularly. MFLOPS per watt was not an ideal metric, but it was easy to report and would encourage energy-efficient, high-performance solutions.
After 6 months of discussion we solicited participation from the broader community. About a year later, in November 2007, we released the first list. The launch of the first Green500 List was an event. As if scripted, just prior to launch, the power problem in data centers had become front-page news and rather suddenly many agreed that supercomputers needed to become more energy efficient.
Some embraced the list and touted high-ranked systems while deriding low-ranked systems. Some complained of being disenfranchised. Some ridiculed our methodology and metrics. Some took issue with the lack of community involvement or coordination with other lists, benchmarks, and government agencies.
The Green500 List Matures
While most of the early dialogue and press affirmed the need for the Green500 List, some valid criticisms led to significant improvements. For example, we released an updated list in early 2008 to include measured numbers from those that did not report to the first list. In succeeding lists, we limited the amount of information we track to focus exclusively on energy efficiency. Later, we obtained research funding to explore the potential use of other benchmarks and metrics.
We’ve actively sought feedback from users as the list has matured. This has resulted in additional lists such as the Little Green500. While entry to the Green500 requires placing among the 500 fastest systems in the world, the Little Green500 broadens this definition to include systems as fast as the slowest supercomputer from the three previous Top500 lists. The goal of this list is to provide efficiency information to those that would deploy smaller systems.
While the Green500 was a bit isolated initially, it is now part of a thriving community of activists promoting energy efficiency. The Climate Savers Computing Initiative, The Green Grid, and the Energy Efficient HPC Working Group are just a few of the proactive groups that ensure energy efficiency is now a first-class constraint in HPC design, procurement, and management. For example, the Energy Efficient HPC Working Group has been instrumental in identifying limitations in the Green500 measurement methodology. They have invested significant time and effort to isolate these limitations and suggest improvements to our methodologies that will likely be adopted in the future. They have also provided a conduit for opening discussions between the Department of Energy and vendors to establish standard practices for evaluating energy efficiency during the procurement process.
Legacy and Future of the Green500
The legacy of the Green500 is the establishment of a consistent, easy-to-follow set of power measurement run rules and the resulting data. Before the Green500 there was no widely accepted methodology for measuring supercomputer power, no way to track energy efficiency from year to year, and thus no way to encourage efficient design. The Green500 power measurement methodology has persisted nearly unchanged for almost 7 years, laying the foundation for a standardized methodology for collecting supercomputer power data. The methodology can always be improved. For example, the Top500 has tweaked its run rules over the years to prevent gaming. However, the early establishment of a set of consistent, easy-to-follow run rules provided fairness and stability in the Green500 List’s critical infancy.
The stability of the run rules enables us to consistently analyze trends in efficiency data from year to year. These trends lead to a number of interesting observations.
I agree with Horst. Assuming its efficiency could be maintained, the TMC CM-5 system from 1993 would have landed in position #493 on the inaugural November 2007 Green500 List. This position is ahead of both the Earth Simulator (#497) and ASCI Q (#500). From 1993 to 2007 the MFLOPS/watt of the fastest systems went from 12 to 357. From 2007 to 2013 the MFLOPS/watt of the fastest systems went from 357 to 3208.
An exascale system in 20 MW will require 50,000 MFLOPS/watt. If efficiency trends continue as they did from 1993 to 2007, a 20 MW exascale system is achievable in about 22 years (2035). The last 6 years saw tremendous efficiency improvements using accelerators. Assuming another efficiency boost from new technology equivalent to the gain from accelerators, an exascale system is achievable in 20 MW in about 9 years (2022). Most likely, we will see moderate gains placing us at exascale in 20 MW by about 2025. This is well beyond the goal of exascale by 2020 in 20 MW.
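The 50,000 MFLOPS/watt requirement follows directly from dividing one exaflop by 20 MW. The sketch below works through that arithmetic and the historical improvement factors quoted above. The function names are hypothetical, and the sketch deliberately stops short of projecting dates, since those depend on the growth model one assumes.

```python
def required_mflops_per_watt(target_flops: float, power_budget_watts: float) -> float:
    """Efficiency (MFLOPS/watt) needed to hit target_flops within a power budget."""
    return target_flops / power_budget_watts / 1e6


def improvement_factor(start: float, end: float) -> float:
    """How many times more efficient the end value is than the start value."""
    return end / start


EXAFLOP = 1e18           # 10^18 floating-point operations per second
POWER_BUDGET_W = 20e6    # 20 MW expressed in watts

required = required_mflops_per_watt(EXAFLOP, POWER_BUDGET_W)

# MFLOPS/watt figures quoted in the text for 1993, 2007, and 2013.
factor_1993_2007 = improvement_factor(12, 357)
factor_2007_2013 = improvement_factor(357, 3208)
remaining = improvement_factor(3208, required)

if __name__ == "__main__":
    print(f"Required efficiency: {required:,.0f} MFLOPS/watt")
    print(f"1993-2007 improvement: {factor_1993_2007:.1f}x over 14 years")
    print(f"2007-2013 improvement: {factor_2007_2013:.1f}x over 6 years")
    print(f"Still needed from the 2013 level: {remaining:.1f}x")
```

Seen this way, the gap from the 2013 level is roughly a factor of 16, which is larger than the gain of about 9x achieved in the accelerator-driven years from 2007 to 2013.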
The shell game. While the Green500 gives us loads of information we never had before, there is little information about the power budget of the components of a system. While knowing total power is helpful, knowing how the power is spent across the system is critical to acquisition decisions. Is the majority of the power budget used on the GPUs, the memory, the CPUs, the disks, the network? Most systems in the Green500 are designed from commodity parts assembled at scale. If we truly want to promote efficiency and enable people to make informed design decisions, we need more insight into the details of where power is spent in these larger systems. Is a system with lots of disk arrays more or less of a power hog than a system with lots of GPUs? I really have no idea. And I’ve been studying power for more than a decade.
Will HPC ever embrace power management? The benefit of power management is clear: it saves energy. Work abounds showing energy savings can be achieved with little to no performance loss. Nonetheless, most supercomputers disable all power management. On the flip side, power management technologies such as Intel Turbo Boost can maximize performance within thermal limits. In fact, the SuperMUC supercomputer in Munich, Germany was chastised by some in the community for enabling Turbo Boost during their early benchmarking and thus potentially skewing their Linpack results.
Trying to adapt benchmarking methodologies to mitigate gaming is welcome. Trying to adapt benchmarking methodologies to neutralize the effects of technologies that improve efficiency is counterproductive and, I believe, ultimately futile. Systems are gaining in complexity every day. They are larger, have more parts and parallelism, and more autonomy in every generation. Processors throttle themselves, and memories and GPUs will soon do the same. Power and performance will not be fixed between two successive runs in these types of dynamic, complex systems. We must develop evaluation methodologies that embrace complexity and non-determinism since they will eventually transcend our ability to adapt. Furthermore, in the long run, the complexity and non-determinism we are attempting to ignore will be essential to maximize performance. Only when we accept complexity and non-determinism as constants can we adopt power management in production systems.
The Future. Accelerators are here to stay, but most computational scientists I know refuse to use them. I’m not sure which group will blink first, the hardware designers or the users. Perhaps the middleware folks will come to the rescue and make accelerators more programmable. In any case, I think we’ll see accelerators dominate the Green500 List until they are replaced by a new technology or abandoned by all.
In every talk I’ve seen by Intel and NVIDIA, the consensus seems to be that we are still really in the first generation of accelerators, with several significantly advanced generations to come. These next generations are faster, have more parallelism, more on-board memory, more power management, and are more tightly integrated with the board. This means, above all, more complexity. These systems will be even harder to program and evaluate. They will likely show modest efficiency gains in the Green500, but they will not match the percentage gains from the first generation, placing exascale beyond the 2020 goal.
While we co-founders have provided a consistent vision, biannual installments of the Green500 List are the work of an army of dedicated students, researchers, and passionate crusaders for energy efficiency. Without selfless adoption by a much broader community, the Green500 List would have been a fleeting anecdote.
It’s been more than twelve years since I started down the Green HPC path. I honestly thought after four to five years we would have exhausted all the interesting problems in HPC efficiency. The Green500 List’s impact has greatly exceeded my expectations. The introduction of a stable and fair methodology to track efficiency has withstood nearly 7 years of scrutiny and highlighted the insatiable need for ongoing research. What I failed to appreciate in the beginning was that power efficiency as a problem would transform and perpetuate with every new generation of supercomputer. Like the challenges of performance, reliability, and security, power efficiency is here to stay.
About the Author
Kirk W. Cameron is a Professor of Computer Science and a Faculty Fellow in the College of Engineering at Virginia Tech. Prof. Cameron is a pioneer and leading expert in Green Computing. Cameron is the Green IT columnist for IEEE Computer, Green500 co-founder, founding member of SPECPower, EPA consultant, Uptime Institute Fellow, and co-founder of power management software startup company MiserWare. His power measurement and management software tools are used by nearly half a million people in more than 160 countries. Accolades for his work include NSF and DOE Career Awards, the IBM Faculty Award, and being named Innovator of the Week by Bloomberg Businessweek Magazine. Prof. Cameron received the Ph.D. in Computer Science from Louisiana State University (2000) and B.S. in Mathematics from the University of Florida (1994).