The 54th Top500, revealed today at SC19, is a familiar list: the U.S. Summit (ORNL) and Sierra (LLNL) machines, offering 148.6 and 94.6 petaflops respectively, remain in first and second place. The only new entrants in the top 20 are due to attrition: Japan’s K computer was decommissioned, as was ORNL’s Titan. This allowed Marconi (CINECA, Lenovo SD530/S720AP, Intel Xeon Phi 7250 68C 1.4GHz/Platinum 8160, Intel Omni-Path) and Nvidia’s DGX SuperPod to slide up two spots (into 19th and 20th place).
The highest-ranking new entrant is AiMOS at number 24, installed at Rensselaer Polytechnic Institute (RPI) in Troy, New York. AiMOS is an IBM Power9 system powered by Nvidia V100 GPUs, of the same ilk as Summit, Sierra, Lassen and Pangea 3, which sit at numbers one, two, 10 and 11, respectively.
If you look at the list along three dimensions – highest positioning, number of systems and aggregate flops performance – the U.S. leads in two out of three. China has the sheer numbers (227 systems versus 118 for the U.S.), but as we’ve covered, the majority are web-scale/cloud systems. The U.S., however, maintains its flops lead with a 37.8 percent share of aggregate performance, ahead of China’s 31.9 percent.
The aggregate performance of the entire list is 1.65 exaflops, and the entry point is 1.14 petaflops – versus 1.56 exaflops and 1.02 petaflops six months ago.
The last list (June 2019) was the first all-petascale Top500 edition. Today’s is the first in which the top 100 systems aggregate more than one exaflops, with the “Top100” climbing to 1,004 petaflops, up from 983 petaflops in June (an atypically low growth rate, but sufficient to crest the exaflops threshold).
Performance share analysis for the top 100 machines looks quite different from that of the entire list. Removing the bottom four-fifths filters out the web-scale machines and better reflects – Linpack shortcomings notwithstanding – the state of leadership supercomputing.
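For readers who want to check the list-over-list changes, here is a minimal sketch that works out the growth rates from the figures quoted above. The helper name `growth_percent` and the dictionary labels are our own illustrative choices, and because the inputs are the rounded numbers from this article, the results are approximate.

```python
def growth_percent(previous: float, current: float) -> float:
    """Relative change from `previous` to `current`, in percent."""
    return (current - previous) / previous * 100.0


# (June 2019, November 2019) figures as quoted in the article.
figures = {
    "full-list aggregate (exaflops)": (1.56, 1.65),
    "entry point (petaflops)": (1.02, 1.14),
    "top-100 aggregate (petaflops)": (983.0, 1004.0),
}

growth = {
    label: growth_percent(old, new) for label, (old, new) in figures.items()
}

for label, pct in growth.items():
    print(f"{label}: {pct:+.1f}%")
```

On these rounded inputs, the top 100 grew by roughly 2 percent in six months, against nearly 6 percent for the full list and close to 12 percent for the entry point.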
By this reckoning, the United States has the most systems: 39, which together aggregate 499 petaflops for a 49.7 percent performance share. China has the second-highest performance share, holding 17.5 percent of the list with 9 systems providing 176 petaflops. Japan’s 14 systems offer 84 petaflops, giving it an 8.5 percent performance share. Germany ties China with 9 systems, but its total flops are lower: 54 petaflops comes out to a 5.5 percent performance share. France operates seven systems amassing a combined 50 petaflops for a 5 percent stake.
The situation hasn’t changed much since June: the U.S. share has nudged up by 0.7 percentage points, while China has lost 0.5.
However… as we reported last year, China has one – possibly two – “mystery” systems that were benchmarked for the list but withheld for political reasons. Had things gone according to the original plan, China would have captured all three target dimensions on the full Top500 and two out of three for the top 100 cohort (the U.S. would still claim the highest number of top 100 systems).
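The performance shares above are simply each country's petaflops divided by the top-100 total. A rough sketch of that arithmetic follows; it assumes the 1,004-petaflops top-100 aggregate quoted earlier as the denominator, and the names `top100` and `performance_share` are illustrative. Since the per-country petaflops are rounded, the recomputed shares can differ from the quoted ones by a tenth or two (Japan and Germany in particular).

```python
TOP100_TOTAL_PF = 1004.0  # top-100 aggregate quoted in the article (assumed denominator)

# country -> (number of systems, aggregate petaflops), as quoted in the article
top100 = {
    "United States": (39, 499.0),
    "China": (9, 176.0),
    "Japan": (14, 84.0),
    "Germany": (9, 54.0),
    "France": (7, 50.0),
}


def performance_share(petaflops: float, total: float = TOP100_TOTAL_PF) -> float:
    """Share of the top-100 aggregate performance, in percent."""
    return petaflops / total * 100.0


shares = {
    country: performance_share(pf) for country, (_, pf) in top100.items()
}

for country, (systems, pf) in top100.items():
    print(f"{country}: {systems} systems, {pf:.0f} PF, {shares[country]:.1f}% share")
```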
The Top500 BoF on Tuesday night (5:15-6:45pm, Mile High Ballroom, Colorado Convention Center) will provide a deeper dive into the latest list trends.
Perhaps the most interesting thing about this year’s Top500 is the Green500, with four new entrants at the top. The two lists were merged in 2016, and the Green500 essentially reorders the Top500 deck based on flops-per-watt efficiency.
In first place is Fujitsu’s new A64FX prototype supercomputer, sort of a mini Fugaku (Riken’s post-K machine). Only the second Arm supercomputer ever to appear on the Top500, it achieved a power efficiency of 16.9 gigaflops-per-watt and is listed at 159 in the Top500 with 2 petaflops Linpack (provided by Fujitsu’s A64FX 48-core 2GHz processors). The first Arm supercomputer to enter the Top500 list (in November 2018) was Astra. That machine, an HPE Apollo 70 with ThunderX2 chips installed at Sandia, stands at 177th on the Green500, delivering 1.537 gigaflops-per-watt.
The second most energy-efficient system is the NA-1 machine, which achieved 16.3 gigaflops-per-watt. The Zettascaler machine relies on PEZY Computing’s PEZY-SC2 processors. Green500 reports it will be installed at NA Simulation in Japan in 2020. It was a surprise to see PEZY on the list, as the Japanese company has been embroiled in controversy since its former president and general manager were charged with fraud in 2017.
Third in Green500 standings is the previously mentioned AiMOS system at RPI, followed by Satori at MIT with 15.6 gigaflops-per-watt. In fifth position, falling from second place, is the ORNL Summit machine with 14.7 gigaflops-per-watt.
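To put the efficiency numbers in perspective, dividing a system's Linpack result by its gigaflops-per-watt figure gives the power draw implied during the benchmark run. The sketch below does this for the two systems whose Linpack and efficiency figures both appear in this article; the function name `implied_power_kw` is our own, and since the inputs are rounded, treat the outputs as ballpark estimates rather than the officially measured power.

```python
def implied_power_kw(rmax_petaflops: float, gflops_per_watt: float) -> float:
    """Power implied by a Linpack result and a Green500 efficiency, in kilowatts."""
    gflops = rmax_petaflops * 1e6  # 1 petaflops = 1e6 gigaflops
    watts = gflops / gflops_per_watt
    return watts / 1e3


# Rounded figures quoted in the article: (Linpack petaflops, gigaflops-per-watt)
systems = {
    "A64FX prototype": (2.0, 16.9),
    "Summit": (148.6, 14.7),
}

power_kw = {
    name: implied_power_kw(pf, eff) for name, (pf, eff) in systems.items()
}

for name, kw in power_kw.items():
    print(f"{name}: about {kw:,.0f} kW")
```

By this estimate the A64FX prototype draws on the order of 120 kilowatts, while Summit sits at roughly 10 megawatts.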
We’ll be looking more closely at Top500 and Green500 results in a future article.
Here are a few additional highlights from the 54th Top500 list, announced in Denver today, shared by the Top500 group.
As a reflection of China’s dominance in sheer numbers, the top three system vendors with regard to the number of installations are Lenovo (174), Sugon (71), and Inspur (65). Cray is number four, with 36 systems, and HPE is number five, with 35. Note that Cray is now part of HPE, so taken together they would effectively tie Sugon with 71 systems.
At the chip level, Intel continues its dominance. Its processors are present in 470 of the 500 systems, split between multiple generations of Xeon and Xeon Phi hardware. IBM is second with 14 systems – 10 with Power CPUs and four with Blue Gene/PowerPC CPUs. AMD claims just three systems on the current list.
NVIDIA is the dominant vendor for accelerators. Its GPUs are present in 136 of the 145 accelerated systems. On the previous list six months ago, there were 134 accelerated systems.
Ethernet is used in 52 percent (258) of the TOP500 systems, while InfiniBand is the network-of-choice in 28 percent (140) of systems. However, from a performance perspective, those positions are reversed, with InfiniBand-based machines representing 40 percent of the TOP500’s aggregate performance and Ethernet-based machines with 29 percent. Custom interconnects, with just 46 installations, claim 22 percent of the list’s installed performance.
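One way to read the interconnect numbers is to compare each technology's share of performance with its share of systems. The following sketch computes that ratio from the figures quoted above; the name `performance_density` is an illustrative label of ours, not a Top500 metric, and a value above 1 simply means the technology's machines are, on average, larger than the list-wide mean.

```python
TOTAL_SYSTEMS = 500

# interconnect -> (number of systems, percent of aggregate performance), as quoted
interconnects = {
    "Ethernet": (258, 29.0),
    "InfiniBand": (140, 40.0),
    "Custom": (46, 22.0),
}


def system_share(count: int, total: int = TOTAL_SYSTEMS) -> float:
    """Share of the list by system count, in percent."""
    return count / total * 100.0


def performance_density(count: int, perf_share: float) -> float:
    """Ratio of performance share to system share (illustrative, hypothetical metric)."""
    return perf_share / system_share(count)


density = {
    name: performance_density(count, perf)
    for name, (count, perf) in interconnects.items()
}

for name, (count, perf) in interconnects.items():
    print(
        f"{name}: {system_share(count):.1f}% of systems, "
        f"{perf:.0f}% of performance, ratio {density[name]:.2f}"
    )
```

On the quoted figures, custom-interconnect machines carry well over twice their weight in performance, InfiniBand systems somewhat more than their weight, and Ethernet systems a little over half.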
The two top-ranked supercomputers on the TOP500, Summit and Sierra, also remain in the top two spots on the list based on the High-Performance Conjugate Gradient (HPCG) benchmark. Summit achieved 2.93 HPCG-petaflops, with Sierra at 1.80 HPCG-petaflops. All the remaining top 10 HPCG entries delivered less than one HPCG-petaflops. With the exception of the now-decommissioned K computer, all 10 of these systems carried over from the previous list six months ago.