November 21, 2011

Blue Genes and GPU Clusters Top the Latest Green500

Michael Feldman

Although the top 10 supercomputers in the world are unchanged from last June, there are some signs that supercomputers overall are getting more energy efficient. The top 10 systems on the new Green500 list average 1530.4 MFLOPS/watt while running Linpack; the top 10 from last June averaged just 1087.0 MFLOPS/watt. That roughly 40 percent increase in performance per watt is a little misleading, though, because the top 10 systems on the list are not representative of the average super.

Here is how the current top green systems stack up as of November:

1. IBM Rochester, Blue Gene/Q (2026.48 MFLOPS/W)

2. IBM Thomas J. Watson Research Center, Blue Gene/Q (2026.48 MFLOPS/W)

3. IBM Rochester, Blue Gene/Q (1996.09 MFLOPS/W)

4. DOE/NNSA/LLNL, Blue Gene/Q (1988.56 MFLOPS/W)

5. IBM Thomas J. Watson Research Center, NNSA/SC Blue Gene/Q Prototype (1689.86 MFLOPS/W)

6. Nagasaki University, DEGIMA Cluster (1378.32 MFLOPS/W)

7. Barcelona Supercomputing Center, Bullx B505 (1266.26 MFLOPS/W)

8. TGCC/GENCI, Curie Hybrid Nodes, Bullx B505 (1010.11 MFLOPS/W)

9. Chinese Academy of Sciences, Mole-8.5 Cluster (963.70 MFLOPS/W)

10. Tokyo Institute of Technology, HP ProLiant SL390s G7 (958.35 MFLOPS/W)
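For readers who want to check the averages quoted earlier, here is a minimal Python sketch. The efficiency figures are copied from the list above and the June average is the one quoted in this article; the variable and function names are illustrative only.

```python
# MFLOPS/W for the November 2011 Green500 top 10, copied from the list above.
nov_2011_top10 = [
    2026.48, 2026.48, 1996.09, 1988.56, 1689.86,
    1378.32, 1266.26, 1010.11, 963.70, 958.35,
]

# Average for the June 2011 top 10, as quoted in the article (MFLOPS/W).
june_2011_avg = 1087.0


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def percent_increase(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


nov_2011_avg = mean(nov_2011_top10)
gain = percent_increase(june_2011_avg, nov_2011_avg)

print(f"November top-10 average: {nov_2011_avg:.1f} MFLOPS/W")
print(f"Increase over June: {gain:.1f} percent")
```

Note that this is a simple average of the ten efficiency figures, not a power-weighted one, so a small prototype rack counts as much as a full-size machine.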

As the list shows, the five most energy-efficient supers are all Blue Gene/Q systems, some housed at IBM facilities and the others at early deployment sites at DOE labs. Blue Gene/Q was officially launched by IBM during SC11, and large deployments are on tap for Argonne National Lab (Mira, 10 petaflops) and Lawrence Livermore National Lab (Sequoia, 20 petaflops) next year.

The next five systems are all accelerated with GPUs: NVIDIA parts in four of them, with the remaining system using ATI Radeon graphics processors. All the supercomputers accelerated by IBM’s now-defunct HPC Cell processor (PowerXCell 8i) are now much further down the list.

It’s notable that the BG/Q systems are about twice as efficient as the GPU-accelerated machines, such as the number 10 TSUBAME system at Tokyo Tech. That’s a significant data point, given that GPU supercomputing is being promoted by NVIDIA and other GPU computing enthusiasts as an energy-efficient alternative to CPU-only systems.
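The size of that gap depends on which machines are compared. The following Python sketch, using only the figures from the top 10 list, computes both the best-versus-tenth ratio and the ratio of the two group averages; the grouping into lists and the variable names are illustrative assumptions, not part of the Green500 data.

```python
# MFLOPS/W for the five Blue Gene/Q entries (ranks 1-5 in the list).
bgq = [2026.48, 2026.48, 1996.09, 1988.56, 1689.86]

# MFLOPS/W for the five GPU-accelerated entries (ranks 6-10 in the list).
gpu = [1378.32, 1266.26, 1010.11, 963.70, 958.35]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Top Blue Gene/Q entry versus the number 10 system (TSUBAME).
best_vs_tenth = bgq[0] / gpu[-1]

# Average Blue Gene/Q entry versus average GPU-accelerated entry.
group_ratio = mean(bgq) / mean(gpu)

print(f"Best BG/Q vs. number 10: {best_vs_tenth:.2f}x")
print(f"BG/Q average vs. GPU average: {group_ratio:.2f}x")
```

On these numbers the top Blue Gene/Q entry is a little over twice as efficient as the number 10 system, while the group averages differ by a somewhat smaller factor, since the best GPU machines in the list sit well above TSUBAME.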

Of course, Blue Gene/Q relies on a custom ASIC and interconnect, while the GPUs in these machines are based on commodity graphics processor designs and are tied together by standard InfiniBand. There are no x86-only systems that can compete with GPUs on a FLOPS/watt basis right now, but the Blue Gene/Q design certainly demonstrates what is possible for a purpose-built HPC processor and custom system network.

The other interesting top 10 factoid is that all five Blue Gene/Q systems are housed in the US, while the five GPU-powered machines are deployed outside of it. That’s mostly a coincidence, but it does point to the slow start of high-end GPU supercomputing in the United States, and to the US-centric nature of the early BG/Q deployments. No doubt, these two architectures will show a more international mix in the months and years ahead.