Super-Connecting the Supercomputers – Innovations Through Network Topologies

By Gilad Shainer, Mellanox Technologies

July 15, 2019

In the article “Super-Connecting the Supercomputers,” published in HPCwire on June 10, 2019, we discussed the three interconnect pillars: the connectivity pillar, the network pillar and the communication pillar. The ‘connectivity pillar’ refers to the elements around the interconnect infrastructure, such as network topologies. The ‘network pillar’ refers to elements such as network transport and routing. The ‘communication pillar’ refers to co-design elements related to communication frameworks, such as MPI, SHMEM/PGAS and more. This article focuses on the first pillar and, in particular, on network topologies.

It may be one of supercomputing's best-kept secrets that innovation actually begins in the structure of the supercomputer; that is, in the way we connect the compute elements together. There are many network topology options, and InfiniBand, specified and designed as the ultimate software-defined network, can support virtually any of them.

Figure 1 – Network Topologies

Fat-Tree (folded Clos) is one of the most widely used topologies. It is a good option for a variety of applications, as it provides low latency and enables a range of throughput options, from non-blocking connectivity to oversubscribed configurations. This topology maximizes data throughput for a variety of traffic patterns; however, it is relatively costly at large scale due to the large number of switches and links it requires. Torus topologies directly interconnect each host to several of its neighbors in a k-dimensional lattice. They are inexpensive but provide low network throughput for adversarial traffic patterns. A torus is a great topology for stencil applications, such as lattice QCD, but due to its blocking nature and higher latency, it is not a preferred option for supercomputers that need to support a variety of applications. Examples of other options in use today or being developed for future use are Hypercube and HyperX.
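To make the torus and Fat-Tree descriptions above more concrete, here is a minimal Python sketch. It is an illustration only: the function names are hypothetical, the torus is assumed to have wraparound links in every dimension, and the Fat-Tree sizing assumes the textbook non-blocking three-level construction built from identical switches with an even radix (k^3/4 hosts, 5k^2/4 switches), which is not taken from this article.

```python
def torus_neighbors(coord, dims):
    """Direct neighbors of `coord` in a torus whose side lengths are `dims`.

    Assumption: every dimension wraps around, so each host has up to
    two neighbors per dimension.
    """
    neighbors = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % size  # wraparound link
            if tuple(n) != tuple(coord):
                neighbors.add(tuple(n))
    return sorted(neighbors)


def torus_hops(a, b, dims):
    """Minimal hop count between hosts a and b (shorter way around each ring)."""
    return sum(min(abs(x - y), size - abs(x - y))
               for x, y, size in zip(a, b, dims))


def fat_tree_size(radix):
    """(hosts, switches) of an assumed non-blocking three-level fat-tree.

    Assumption: textbook construction from identical switches, even radix.
    """
    hosts = radix ** 3 // 4
    switches = 5 * radix ** 2 // 4
    return hosts, switches


if __name__ == "__main__":
    dims = (4, 4, 4)
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_hops((0, 0, 0), (3, 2, 1), dims))
    print(fat_tree_size(36))
```

Under these assumptions, each host in a 3-dimensional torus has six direct neighbors, while traffic between distant hosts must cross several hops, which is one way to see the higher latency noted above. The Fat-Tree function shows how the switch count grows with the square of the radix.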

The Dragonfly topology was introduced by John Kim et al. in the paper “Technology-Driven, Highly-Scalable Dragonfly Topology.” Like Fat-Tree, Dragonfly provides good performance for a variety of applications (or communication patterns); in addition, it reduces network cost compared to other topologies by reducing the number of long links.

As seen in Figure 2, Dragonfly is based on groups of connected compute elements, where all the groups are connected in a full graph. One can create any inner-group structure, such as a full graph (Dragonfly), a generalized hypercube (GHC), or a Fat-Tree, as seen in Figure 3.
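The "full graph" between groups can be sketched in a few lines of Python. This is a simplified illustration that assumes exactly one global link per pair of groups; the function name is hypothetical, and real deployments may use several parallel links between the same two groups.

```python
from collections import Counter
from itertools import combinations


def dragonfly_global_links(num_groups):
    """Group pairs joined by a global link when groups form a full graph.

    Assumption: exactly one global link per pair of groups.
    """
    return list(combinations(range(num_groups), 2))


if __name__ == "__main__":
    links = dragonfly_global_links(5)
    # Count how many global links terminate at each group.
    per_group = Counter(g for link in links for g in link)
    print(len(links), dict(per_group))
```

With g groups, this yields g(g-1)/2 global links and g-1 global links per group, so any two groups are a single long hop apart.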

Figure 2 – Dragonfly Topology

 

Figure 3 – Dragonfly Group Options

The full graph option has been used in the traditional Dragonfly topology deployed with proprietary networks over the years. The Fat-Tree option is used by the new Dragonfly+ (DF+) topology, supported by InfiniBand. Compared to the traditional Dragonfly, Dragonfly+ is more scalable, since it allows a larger number of hosts to be connected to the network for the same switch radix; it provides better worst-case throughput for the same number of global inter-group links; and it enables better switch buffer utilization.
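The scalability claim can be illustrated with a rough back-of-the-envelope calculation. The sketch below rests on assumptions that do not come from this article: the traditional Dragonfly uses the commonly cited balanced configuration (a = 2p = 2h, with a switches per group, p hosts and h global links per switch), and a Dragonfly+ group is a two-level bipartite Fat-Tree in which leaf switches split their ports evenly between hosts and spines, and spine switches split theirs evenly between leaves and global links. Function names are hypothetical, and real systems may be configured differently.

```python
def dragonfly_max_hosts(radix):
    """Maximum hosts of an assumed balanced traditional Dragonfly.

    Assumption: a = 2p = 2h, so each switch needs p + (a - 1) + h = 4p - 1 ports.
    """
    p = (radix + 1) // 4          # hosts per switch
    a, h = 2 * p, p               # switches per group, global links per switch
    groups = a * h + 1            # one global link per pair of groups
    return a * p * groups


def dragonfly_plus_max_hosts(radix):
    """Maximum hosts of an assumed Dragonfly+ with bipartite Fat-Tree groups.

    Assumption: radix/2 leaves and radix/2 spines per group, half of each
    switch's ports facing down and half facing up.
    """
    half = radix // 2
    hosts_per_group = half * half
    global_links_per_group = half * half
    groups = global_links_per_group + 1   # one global link per pair of groups
    return hosts_per_group * groups


if __name__ == "__main__":
    for radix in (36, 40):
        print(radix, dragonfly_max_hosts(radix), dragonfly_plus_max_hosts(radix))
```

Under these assumptions, a 36-port switch supports roughly 26,000 hosts in the traditional Dragonfly and roughly 105,000 hosts in Dragonfly+, which is consistent with the direction of the claim above, though the exact figures depend on the configuration chosen.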

Multiple papers have identified performance limitations of the traditional Dragonfly, such as performance degradation under adversarial traffic and the negative impact on network bandwidth when using a higher switch radix. Examples include “Performance implications of remote-only load balancing under adversarial traffic in Dragonflies,” by Bogdan Prisacari, German Rodriguez, Marina Garcia, Cyriel Minkenberg (IBM Research – Zurich) and Enrique Vallejo, Ramon Beivide (University of Cantabria, Spain); and “Modeling UGAL on the Dragonfly Topology,” by Scott Pakin, Michael Lang (Los Alamos National Laboratory) and Atiqul Mollah, Peyman Faizian, Shafayat Rahman, Xin Yuan (Florida State University).

On the other hand, Dragonfly+ supports multiple routes from the ingress switch to the egress switch and, thanks to the Fat-Tree topology within the group, delivers the highest data throughput without any dependency on the switch radix. This makes it a superior option to the traditional Dragonfly topology for large-scale supercomputing platforms.

The University of Toronto was the first to deploy a large-scale InfiniBand Dragonfly+ supercomputer, which has now been in production for nearly a year and a half. The Niagara supercomputer, shown in Figure 4, is Canada’s most powerful research supercomputer.

Figure 4 – The Niagara Supercomputer and the Dragonfly+ Topology

Another advantage of Dragonfly+ is the ability to scale the cluster over time by adding new groups, whether compute or storage, without re-cabling any of the long cables. Neither the Fat-Tree nor the traditional Dragonfly topology supports this. It is of great benefit for multi-phase supercomputers and for meeting compute or storage demands that grow over time.

The advantages of Dragonfly+ make it the preferred topology for the new generation of large-scale supercomputing. For example, CSC, the Finnish IT Center for Science, a national HPC center providing supercomputing and networking services for Finnish academia, research institutes, the public sector and industry, has selected the Dragonfly+ InfiniBand topology for its next-generation supercomputer. We expect to hear more announcements of new large-scale supercomputers around the world adopting the Dragonfly+ topology. For more information, contact [email protected].

 
