In the battle of the DOE labs, Oak Ridge Lab’s Titan supercomputer has taken the title from the former TOP500 champ, Lawrence Livermore’s Sequoia. The GPU-charged Titan, using the new NVIDIA K20X-equipped XK7 blades from Cray, delivered 17.6 petaflops to Sequoia’s 16.3 petaflops on Linpack, the sole metric for TOP500 rankings.
Titan looks like it will also take the energy-efficiency title from Sequoia and the Blue Gene/Q platform. The Oak Ridge super delivers 2,120 megaflops/watt, besting Sequoia’s current mark of 2,100 megaflops/watt. The results, however, won’t be official until the Green500 list is announced later this week.
Despite being knocked out of the top spot, IBM machines still claim 6 of the top 10 systems:
- 17.6 petaflops, Titan (Cray), United States
- 16.3 petaflops, Sequoia (IBM), United States
- 10.5 petaflops, K computer (Fujitsu), Japan
- 8.2 petaflops, Mira (IBM), United States
- 4.1 petaflops, JUQUEEN (IBM), Germany
- 2.9 petaflops, SuperMUC (IBM), Germany
- 2.7 petaflops, Stampede (Dell), United States
- 2.6 petaflops, Tianhe-1A (NUDT), China
- 1.7 petaflops, Fermi (IBM), Italy
- 1.5 petaflops, DARPA Trial Subset (IBM), United States
Although turnover was minimal, the aggregate performance at the top is growing rapidly. These systems now represent more than 68 petaflops; a year ago those top 10 machines encompassed just over 22 petaflops.
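As a quick sanity check on that aggregate, the arithmetic can be sketched in a few lines of Python. The per-system figures are the rounded Linpack numbers from the list above; the year-ago baseline of 22.0 petaflops is an assumption standing in for "just over 22", so the growth factor is approximate.

```python
# Rounded Linpack results (petaflops) for the top 10, as listed in the article.
top10_pflops = {
    "Titan": 17.6,
    "Sequoia": 16.3,
    "K computer": 10.5,
    "Mira": 8.2,
    "JUQUEEN": 4.1,
    "SuperMUC": 2.9,
    "Stampede": 2.7,
    "Tianhe-1A": 2.6,
    "Fermi": 1.7,
    "DARPA Trial Subset": 1.5,
}

# Aggregate performance of the ten systems.
total = sum(top10_pflops.values())

# Fraction of the aggregate contributed by Titan alone.
titan_share = top10_pflops["Titan"] / total

# Assumption: "just over 22 petaflops" a year ago is approximated as 22.0.
previous_total = 22.0
growth = total / previous_total

print(f"Top 10 aggregate: {total:.1f} petaflops")
print(f"Titan's share of the aggregate: {titan_share:.0%}")
print(f"Growth over the year-ago top 10: roughly {growth:.1f}x")
```

With these rounded inputs, the top 10 works out to roughly three times the year-ago aggregate, with Titan alone accounting for about a quarter of the total.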
A nice chunk of that is thanks to Titan, of course, but the ORNL super also brings a GPU-accelerated supercomputer back to the head of the list. The last time such a machine held that title was November 2010, when China’s Tianhe-1A system was the number one machine. Despite the ascendance of Titan, accelerator-equipped systems still make up a relatively small portion of the list: currently 62 systems.
But that’s four more than just six months ago, and with the launch of the teraflop accelerators this week from Intel (Knights Corner), NVIDIA (Kepler K20 GPUs), and AMD (FirePro S10000), those numbers will almost certainly grow. When you can buy a teraflop on a PCIe card for a few thousand dollars, it becomes a lot easier to string together a petaflop machine. While CPU-only supercomputers still have a lot of life in them, the smart money is on these vector-heavy coprocessors to expand the number of petaflop systems in the world.
Besides Titan, new to the top 10 are Dell’s Stampede and IBM’s DARPA Trial Subset machine. The Stampede machine, installed at the Texas Advanced Computing Center (TACC), debuts Intel’s Knights Corner manycore accelerator, while IBM’s DARPA Trial Subset is an implementation of the Power7-based PERCS architecture, developed in conjunction with the High Productivity Computing Systems (HPCS) program. JUQUEEN is not new to the top 10, but it has tripled its capacity since June, moving it from number 8 to number 5.
Stampede could also make its way further up the list by next June. The TACC super is slated to reach 10 peak petaflops when the system is fully deployed in 2013, which should get the Linpack mark to about 6.7 petaflops. By then though, there is likely to be even more competition in the multi-petaflops realm.
On the interconnect front, InfiniBand-based supercomputers continue to steal share from Ethernet. Over the last six months, 15 InfiniBand systems were added, for a total of 226, while Ethernet lost 19 machines, reducing its total to 188. At the top of the list though, custom interconnects rule. In the top 10, there is but one that uses InfiniBand (Stampede); the rest employ custom interconnects of various stripes from Cray, IBM, Fujitsu, and China’s NUDT.
The one TOP500 element that remained fairly constant this time around was the geographical distribution of Linpack FLOPS. The US is still the dominant nation with 251 systems (down one from last June). China is in second place with 72 systems (down two from June). The European superpowers (UK, France and Germany) have reached parity, more or less, with 24, 21, and 20 systems, respectively.
Perhaps the most significant development on this latest list is the growth of petascale supercomputers, which currently constitute the top 23 systems. That’s up from the top 10 just a year ago. It’s projected that by 2015, all 500 machines will deliver a petaflop or more.