Top500: Fugaku Keeps Crown, Nvidia’s Selene Climbs to #5

By Tiffany Trader

November 16, 2020

With the publication of the 56th Top500 list today from SC20’s virtual proceedings, Japan’s Fugaku supercomputer – now fully deployed – notches another win, while Nvidia’s in-house HPC-AI machine Selene doubles in size, moving up two spots to secure a fifth-place finish. New to the top-10 cohort are “JUWELS Booster Module” (Forschungszentrum Jülich, #7) and Dammam-7 (Saudi Aramco, #10). Nvidia captured the Green500 crown with an A100-driven DGX SuperPOD (#172 on the Top500) that delivered 26.2 gigaflops-per-watt.

Fugaku extends leads on HPC benchmarking (source: Satoshi Matsuoka, Top500 BoF)

RIKEN’s Fugaku supercomputer boosted its Linpack score to 442 petaflops, up from its debut listing of 415 petaflops six months ago, thanks to the addition of 6,912 nodes, bringing it to its full complement of 158,976 single-CPU A64FX nodes. The Fujitsu Arm-based system also improved its performance on the new mixed-precision HPL-AI benchmark to 2.0 exaflops, up from 1.4 exaflops six months ago. It grew its #1 leads on the HPCG and Graph500 rankings, and maintained its standing in the upper echelon of the Green500 energy-efficiency rankings, holding onto tenth position with 14.78 gigaflops-per-watt.
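Since the HPL-AI result uses mixed precision, it is not directly comparable to the double-precision Linpack score, but a quick back-of-the-envelope calculation from the figures quoted above (a rough sketch, nothing official) puts the mixed-precision run at roughly 4.5 times the double-precision result:

    # Rough comparison of Fugaku's mixed-precision and double-precision results,
    # using only the figures quoted in this article (values are approximate).
    hpl_rmax_exaflops = 0.442   # 442 petaflops on double-precision HPL
    hpl_ai_exaflops = 2.0       # 2.0 exaflops on mixed-precision HPL-AI

    speedup = hpl_ai_exaflops / hpl_rmax_exaflops
    print(f"HPL-AI vs. HPL speedup: ~{speedup:.1f}x")  # prints ~4.5x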

Simply put, it’s another multi-category sweep for the first number-one Arm supercomputer, which is named for the highest mountain peak in Japan (Fugaku is another name for Mount Fuji, formed by combining the first character of 富士, Fuji, with 岳, mountain).

Summit and Sierra (IBM/Mellanox/Nvidia, United States) remain at number two and three, respectively, and Sunway TaihuLight (China) holds steady in fourth position.

Climbing two spots to number five, an upgraded Selene supercomputer delivers 63.4 Linpack petaflops, more than doubling its previous score of 27.6 petaflops. Selene implements Nvidia’s modular DGX SuperPOD architecture with AMD Epyc CPUs and the new A100 80GB GPUs, which provide twice as much HBM2 memory as the original A100 40GB GPUs. In-house AI workloads, system development and testing, and chip design work are all key use cases for Selene. (Side note: Selene was previously designated as an industry system, but the Nvidia site has been brought under the vendor segment, aligning with Nvidia’s status as a system supplier.)

The Chinese-built Tianhe-2A slides one spot to number six with 61.4 petaflops. Equipped with Intel Xeon chips and custom Matrix-2000 accelerators, Tianhe-2A (aka MilkyWay-2A) entered the list in 2018 at number four. It is installed at the National Supercomputer Center in Guangzhou.

New at number seven is the Atos-built JUWELS Booster Module — the most powerful system in Europe with 44.1 Linpack petaflops. Powered by AMD Epyc CPUs and Nvidia GPUs and installed at Forschungszentrum Jülich (FZJ) in Germany, the system leverages a modular system architecture. It is a companion to the Intel Xeon-powered JUWELS Module, which sits at position 44 on the list — both were integrated using the ParTec Modulo Cluster Software Suite.

Dell is still the top commercial and top academic supplier, with eighth- and ninth-place wins (HPC5/Eni and Frontera/TACC, respectively).

Rounding out the top ten at number 10 is the second newcomer system: Dammam-7. Installed at Saudi Aramco in Saudi Arabia, it is also the second industry supercomputer in the current top 10, joining HPC5 (Eni/Dell) at number eight. The HPE Cray CS-Storm system uses Intel Xeon Gold CPUs and Nvidia Tesla V100 GPUs, and it achieved 22.4 petaflops on the HPL benchmark.

Extending our purview to the top 50 segment, the list welcomes six additional systems: Hawk at HLRS with HPE (#16), TOKI-SORA at the Japan Aerospace Exploration Agency (JAXA) with Fujitsu (#19), Taranis at Meteo France with Atos (#30), Plasma Simulator at Japan’s National Institute for Fusion Science with NEC (#33), an unnamed system at the Japan Atomic Energy Agency with HPE (#45), and Emmy+ at HLRN with Atos (#47).

The addition of #19 TOKI-SORA, a Fujitsu A64FX system similar in design to Fugaku, brings the total number of Arm-based machines on the list to five. Four of these were built by Fujitsu using its A64FX chips, while Astra at Sandia Labs (the world’s first petascale Arm system) was built by HPE using Marvell’s ThunderX2 processors.

The flattening trend we saw in June, driven in part by COVID-19’s disruptions, continues, with the list setting another record-low refresh rate of only 44 new entrants (38 systems fell off the latest list and another six were removed after reaching end of life). Of this group of new entrants, the 11 highest ranked are not based in the U.S. The U.S. added eight new systems, including Sandia Labs’ “SNL/NNSA CTS-1 Manzano” system (supplied by Penguin Computing with Intel CPUs and Intel Omni-Path) at #69 with 4.3 Linpack petaflops. China claimed the highest number of new systems – 13 – although all but ten of these use 10G or 25G interconnects, indicative of Web-scale, rather than true HPC, deployments. Japan put an impressive six new systems on the list, showcasing a diverse set of architectures: Fujitsu with A64FX Arm chips (and Tofu interconnect); Fujitsu with Intel and Nvidia chips; NEC’s SX-Aurora TSUBASA vector engine; Dell PowerEdge with AMD Epyc CPUs; and HPE SGI with Intel CPUs only.

The Selene SuperPOD system.

Diving into the networking makeup of the list, 157 systems use InfiniBand, inclusive of the Sunway TaihuLight system, which uses a semi-custom version of HDR InfiniBand. There are six systems with Tofu, 31 with Aries, and a handful with custom or other proprietary interconnects. Omni-Path is the interconnect technology on 47 machines, including one new system (Livermore’s Ruby supercomputer) that uses the Cornelis Networks version. Launched in 2015, Intel’s Omni-Path Architecture (OPA) failed to find sufficient market footing and Intel pulled the plug on it in 2019. The IP was spun out as Cornelis Networks in September of this year with the encouragement of U.S. labs. In addition to Ruby (Supermicro/Intel), Livermore’s recently announced Mammoth Cluster (Supermicro/AMD) also uses Cornelis Omni-Path networking.

The aggregate Linpack performance provided by all 500 systems is 2.43 exaflops, up from 2.22 exaflops six months ago and 1.65 exaflops a year ago. The Linpack efficiency of the entire list is holding steady at 63.3 percent, compared with 63.6 percent six months ago, and the Linpack efficiency of the top 100 segment is also essentially unchanged: 71.2 percent compared with 71.3 percent six months ago. The number one system, Fugaku, delivers a healthy computing efficiency of 82.28 percent, up a smidge from June’s 80.87 percent.
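Linpack efficiency here is simply measured performance (Rmax) divided by theoretical peak performance (Rpeak). As a minimal sketch, assuming Fugaku’s Rpeak of roughly 537 petaflops (a figure taken from its Top500 entry, not from this article), the quoted efficiency can be reproduced as follows:

    # Linpack efficiency = Rmax / Rpeak.
    # Rpeak of ~537.2 petaflops is assumed from Fugaku's Top500 entry, not stated above.
    fugaku_rmax_petaflops = 442.0
    fugaku_rpeak_petaflops = 537.2

    efficiency_pct = fugaku_rmax_petaflops / fugaku_rpeak_petaflops * 100
    print(f"Fugaku HPL efficiency: {efficiency_pct:.2f}%")  # prints ~82.28%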

The minimum Linpack score required for the 56th Top500 list is 1.32 petaflops, versus 1.22 petaflops six months ago. The entry point for the top 100 segment is 3.16 petaflops versus 2.80 petaflops for the previous list. The current #500 system was ranked at #462 on the last edition.

As was the case six months ago, only two machines have crossed the 100 Linpack petaflops horizon (Fugaku and Summit). Four, if you count the two Chinese (Sugon) systems that were reportedly benchmarked over the last couple of years but never officially placed on the list (sources reported one system measured ~200 petaflops and a second reached over 300 petaflops). China has curtailed its supercomputing PR push in response to tech war tensions with the U.S. that came to a head 18 months ago.

Energy efficiency gains

Nvidia tops the Green500 energy-efficiency rankings with its DGX SuperPOD (#172 on the Top500). Equipped with A100 GPUs, AMD Epyc Rome CPUs and HDR InfiniBand technology, it achieved 26.2 gigaflops-per-watt during its 2.4-petaflops Linpack run.
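Green500 efficiency is measured Linpack performance divided by the power drawn during the run, so the two figures above imply a power envelope in the neighborhood of 90 kW. A rough sketch of that arithmetic, using only the numbers quoted in this article:

    # Implied power draw = Linpack performance / energy efficiency.
    # A rough estimate from the quoted figures, not an official measurement.
    rmax_gigaflops = 2.4e6              # 2.4 petaflops expressed in gigaflops
    efficiency_gflops_per_watt = 26.2   # Green500 result

    implied_power_kw = rmax_gigaflops / efficiency_gflops_per_watt / 1000
    print(f"Implied power draw: ~{implied_power_kw:.0f} kW")  # prints ~92 kW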

The previous Green500 leader, MN-3 from Preferred Networks, slips to second place despite improving its rating from 21.1 to 26.0 gigaflops-per-watt. Ranked 332nd on the Top500, MN-3 is powered by the MN-Core chip, a proprietary accelerator that targets matrix arithmetic.

In third place on the Green500 is the Atos-built JUWELS Booster Module installed at Forschungszentrum Jülich. The new entrant — powered by AMD Epyc Rome CPUs and Nvidia A100 GPUs with HDR InfiniBand — delivered 25.0 gigaflops-per-watt and is ranked at number seven on the Top500.

Other list highlights

This list includes 148 systems that make use of accelerator/co-processor technology, up from 146 in June. Of these, 110 have Nvidia Volta chips, 15 use Nvidia Pascal, and eight leverage Nvidia Kepler. There is only one entry on the list that uses AMD GPUs: a Sugon-built Chinese system at the Pukou Advanced Computing Center, powered by AMD Epyc “Naples” CPUs and AMD Vega 20 GPUs. That system, now at #291, first appeared one year ago.

The Top500 reports that Intel continues to provide the processors for the largest share (91.8 percent) of Top500 systems, down from 94.0 percent six months ago. AMD supplies CPUs for 21 systems (4.2 percent), up from 2 percent on the previous list. AMD also enjoys a higher top ranking with Selene (#5) than Intel does with Tianhe-2A (#6). The top four systems on the list are not x86-based.

Performance development over time – 1993-2020 (Source: Top500)