ESnet6 Launches, Heralding a New Era in Scientific Networking

By Oliver Peckham

October 11, 2022

The launch of ESnet6 was announced at an event at Berkeley Lab this morning. ESnet – short for “energy sciences network” – is managed by Berkeley Lab, funded by the DOE’s Office of Science and provides high-speed networking services to dozens of DOE research sites, including all of the National Labs. The latest iteration of the network, ESnet6, is the product of six years of hard work, much of which transpired during a pandemic. The upgrade includes a massively enhanced and expanded fiber optic backbone, better network automation and more.

The need for ESnet6

“The reason why we embarked on the facility upgrade … is we were seeing exponential increases in data,” explained Inder Monga, executive director of ESnet and division director for scientific networking at Berkeley Lab, at the September meeting of the Advanced Scientific Computing Advisory Committee (ASCAC) a couple of weeks ago. “We moved around 1.1 exabytes of data last year, and the data rate has been going somewhere between 50-60% year-on-year growth every year since 1990.”
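To put Monga's figures in perspective, the short sketch below works through what sustained 50-60% annual growth implies. It assumes the growth rate stays constant and uses the 1.1 exabytes quoted above as the starting point; the function names and the five-year horizon are illustrative choices, not ESnet projections.

```python
import math

BASE_EXABYTES = 1.1  # annual volume quoted in the article


def doubling_time_years(annual_growth: float) -> float:
    """Years needed for traffic to double at a constant year-on-year growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


def projected_volume(base_exabytes: float, annual_growth: float, years: int) -> float:
    """Annual volume after `years` of compound growth (simplifying assumption)."""
    return base_exabytes * (1 + annual_growth) ** years


if __name__ == "__main__":
    for rate in (0.50, 0.60):  # the 50-60% range cited by Monga
        print(
            f"{rate:.0%} growth: doubles every {doubling_time_years(rate):.2f} years, "
            f"about {projected_volume(BASE_EXABYTES, rate, 5):.1f} EB/year after 5 years"
        )
```

Under these assumptions traffic doubles roughly every year and a half to two years, which is why capacity alone was a pressing driver for the upgrade.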

Beyond sheer capacity needs, Monga explained that they wanted to improve the network’s resiliency as the productivity of the National Labs (and DOE-funded science generally) increasingly relied on networks – a reality only accelerated by the arrival of Covid. Finally, he cited the rapid convergence of multiple facilities, simulations, experimental data and more. “We wanted to make sure that this network supported these new scientific workflows through automation and programmability, [ensuring] our ability to create custom services that may be specific to a project, to a science workflow,” Monga said.

A difficult task completed ahead of schedule

Achieving these three goals through ESnet6 – capacity, resilience and automation – was a “transformative and challenging” process, Monga continued. It was a project filled with firsts: it was ESnet’s first project developed under the DOE’s 413.3B project management process; the first greenfield design and build of the entire network conducted by the ESnet team itself; and the first time the ESnet team had implemented and operated the optical layer, a process run through partners in previous iterations. And, beyond those firsts, more than 50% of the project team was hired during the project, the pandemic hit immediately after the design phase (which Monga said resulted in a “10× increase” in coordination, communication and reporting), and the project was allocated zero unplanned downtime and very limited off-hours, planned downtime.

But, remarkably, after around six years of conceptualization and execution (the first steps were taken around December 2016), ESnet6 is (by and large) launching even earlier than the team might have hoped. “We finished ahead of time and under-budget,” Monga said. He attributed that success to a few things, such as spending around 18 months on design (“it helped us a lot”) and building the optical infrastructure as soon as possible. The team’s preparedness meant that, when the pandemic hit, the significant delays that did occur – such as in the installation of new routing equipment – were not debilitating. “Our critical path was not delayed,” Monga said.

The ESnet6 timeline. Image courtesy of Inder Monga.

The result, by the numbers: 15,000 miles of dark fiber lit up; 300 leased spaces that now house ESnet-owned equipment across the U.S. (optical amplifiers, spaced every 80km or so along the lines); 46.1Tbps aggregate deployed capacity; services ranging from 400Gbps to 1Tbps. The network is operating on a 20-year lease with an extension to 30 years. The ESnet team also replaced all of its backbone routers and all routers at connected sites, opting for commercial offerings (prominently, the Nokia 7750 routers) rather than white-labeling. (Additional hardware partners: AMD, which provided its Alveo FPGA-based network-attached accelerator cards; Ciena; Infinera; and Lumen.) More than 70 new routers were installed, while 53 were decommissioned – all without an interruption in service.
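As a rough sanity check on those numbers, the sketch below estimates how many amplifier sites 15,000 miles of fiber would need at one site every 80km. It assumes perfectly uniform spacing and ignores route endpoints and any leased spaces that house other equipment, so it is a back-of-the-envelope figure rather than ESnet's actual site plan; the function name is hypothetical.

```python
KM_PER_MILE = 1.609344  # exact international mile


def estimated_amplifier_sites(fiber_miles: float, spacing_km: float) -> int:
    """Approximate site count, assuming one amplifier per `spacing_km` of fiber."""
    return round(fiber_miles * KM_PER_MILE / spacing_km)


if __name__ == "__main__":
    sites = estimated_amplifier_sites(15_000, 80)
    print(f"Estimated amplifier sites: {sites}")
```

The estimate lands close to the 300 leased spaces cited above, which suggests the two figures are consistent with each other.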

A map of ESnet6. Image courtesy of the DOE.

“The thing that I find frankly astonishing about ESnet6 is that Inder Monga and his team have been able to implement this system – add new fiber and repurpose older fiber – and at the same time serve all the needs that ESnet5 needed to serve and continued to serve while they were building ESnet6,” said Vint Cerf, vice president and chief internet evangelist for Google (and “father of the internet”), in his keynote for the launch event.

Meaningful enhancements

Cerf also lauded ESnet6’s “ability to dynamically allocate its resources to meet various and sundry kinds of demands – some of which have quite high variation.” To that end, Monga said, the ESnet team was also tasked with building a network where individual routers didn’t require bespoke treatment and documentation from a range of engineers. “Before ESnet6, … each network device was treated as a pet, and not cattle,” Monga said, explaining how network devices would each require special treatment and know-how to successfully operate.

ESnet6, in contrast, uses its “Orchestrator” to intelligently automate ESnet’s configuration and redirect massive scientific data flows. A new platform also allows ESnet staff to monitor more of the data packets while they’re traveling along the lines, allowing, again, for more proactive and ubiquitous management of the network. ESnet6 also includes upgraded security features, namely “black hole” routing that allows ESnet to block traffic to or from any source without interrupting any of the other network traffic.

The upgraded network has already borne fruit.

“Something we just did very recently was replicate the entire contents of the Intergovernmental Panel on Climate Change CMIP archive from Livermore Lab to Argonne and Oak Ridge,” said Ian Foster, senior scientist and distinguished fellow at Argonne National Laboratory, in his keynote for the launch event. That many-petabyte transfer (enabled by Globus), he said, had taken three months – accelerating as the network matured – and was completed in a fully automated manner, without any human interaction. Foster lauded the network’s possibilities, highlighting how sensors and computational resources across the country could now be more seamlessly interconnected. “Now we know the network can be totally relied on, we can start looking at using it to do new things.”

“As scientific instruments grow in complexity and supercomputers simulate scientific phenomena at higher resolutions, the science community is facing a growing challenge: data volumes that are increasing exponentially, coupled with the need to move, share, and process this data globally and faster than ever before,” said Barbara Helland, associate director of the DOE Office of Science’s Advanced Scientific Computing Research program. “With ESnet6, DOE researchers are equipped with the most sophisticated technology to help tackle the grand challenges we face today in areas like climate science, clean energy, semiconductor production, microelectronics, the discovery of quantum information science and more.”
