Berkeley Lab’s Networking Middleware GASNet Turns 20

December 12, 2022

Dec. 12, 2022 — In 2002, as massively parallel supercomputers were offering more computing power to programmers, a significant limitation to overall effectiveness was the time it took for processors to communicate with each other.

“Architectures were evolving quickly, and it was becoming clear that the dominant high-performance computing layer for communications, Message Passing Interface (MPI), was not well-suited at the time to take advantage of the Remote Memory Access (RMA) capabilities that were becoming available in network hardware,” said Dan Bonachea, Lawrence Berkeley National Laboratory (Berkeley Lab) Computer Systems Engineer.

Recognizing that RMA would become an important feature in future HPC systems, Bonachea, then a UC Berkeley graduate student, dedicated his CS258 Spring 2002 semester project to developing a solution to this problem. Building on the Active Messages (AM) paradigm developed at UC Berkeley a decade earlier, he designed GASNet (short for Global-Address Space Networking), a network-independent and language-independent high-performance communication interface for implementing the runtime system of global address space languages, such as UPC and Titanium.

He published the first GASNet Specification technical report in October of that year. He continued to build and improve on it as a graduate student researcher in Berkeley Lab’s Future Technologies Group and later as an engineer in the Lab’s Computer Languages and Systems Software Group, working in collaboration with Lab engineers and scientists, including Paul Hargrove and Katherine Yelick.

Shallow Water Tsunami Simulation solving shallow water Navier-Stokes equations, written using an Actor library communicating via UPC++ and GASNet-EX. Credit: A. Pöppl, M. Bader and S. Baden.

Twenty years later, GASNet is thriving with dozens of clients in academia, national laboratories, and industry. It was also recently upgraded to support exascale scientific applications via the Department of Energy’s Pagoda project. This new version, called GASNet-EX, supports scientific applications in drug discovery (NWChemEx), metagenomics research (ExaBiome), COVID-19 infection simulation (SIMCoV), and much more. Major companies also write software that depends on GASNet-EX, such as Hewlett Packard Enterprise’s (HPE’s) Chapel.

“It can take 20 years or more to develop a high-quality parallel-language compiler, which is a huge effort, and machines change much faster than that. With GASNet, the idea was to isolate the compiler writers from low-level hardware details. We aimed to provide them with a virtual interface to the communication layer so they could target something stable that is hardware- and network-independent, and let GASNet help it work efficiently on various HPC systems,” said Bonachea. “Many application developers don’t even know that GASNet exists. They might know that they’re programming in Chapel, but they don’t know that this cool embedded library handles the communications services.”

“Looking back at what we’ve accomplished with GASNet, I’m proud that we’ve helped so many application developers reach their goals by allowing them to skip designing and implementing a network runtime,” added Hargrove. “Most people who write software take pride in knowing that it’s being used, and we’re no exception. I’m proud that some two dozen projects over the years chose to incorporate GASNet because they were reading through the literature or doing a web search and realized that what we’ve built is useful to them.”

GASNet Going Strong Two Decades Later

Both Bonachea and Hargrove credit the success and longevity of GASNet to several factors: a dedicated and agile development team with an unwavering, principled mission; their proximity and easy access to resources at the National Energy Research Scientific Computing Center (NERSC) and UC Berkeley; and the consistent support of Berkeley Lab Senior Faculty Scientist Katherine Yelick, who was recently appointed UC Berkeley Vice Chancellor of Research, as well as continued funding from the Department of Energy’s Office for Advanced Scientific Computing Research.

Twenty years ago, Bonachea wasn’t the only graduate student who aimed to build a communication library that could take advantage of emerging RMA and AM capabilities. In fact, many of GASNet’s competitors at the time were developed by graduate students for their Master’s or Ph.D. theses. What helped GASNet gain traction was Bonachea’s relationship with Berkeley Lab via Yelick, his adviser, and UC Berkeley’s and the Lab’s proximity to NERSC. With these connections, Bonachea had a specific set of users to design for and access to NERSC’s supercomputers, which gave him the resources to test his ideas.

“As Berkeley Lab Computer Systems Engineers, it’s our job to develop production-quality software for others. So unlike most graduate students, we think about longevity and maintainability. If an adjustment needs to be made three or five years after a project ends, we need to be able to go back into the software and understand the code we wrote at the beginning,” Hargrove said.

“I certainly didn’t think I’d still be working on GASNet 20 years later; as a graduate student, I was focused on the near-term of about three to five years,” said Bonachea. “But, I knew of at least three different software projects that relied on GASNet at the time, so I took a principled approach to follow good software engineering practices from the very beginning to ensure that I was making something usable and maintainable.”

According to Hargrove, another big part of GASNet’s success is that the right people used and advocated for it at the right time. “GASNet was built to meet the needs of locally based projects like the Berkeley UPC compiler suite, which Yelick led, so it was a foregone conclusion that they’d use it. But when people like Rice University’s John Mellor-Crummey incorporated GASNet into the Coarray Fortran programming model and HPE/Cray’s Bradford Chamberlain incorporated it into the Chapel programming language, it helped build up the reputation and momentum for our work,” he added.

“We began using GASNet because its features were an exact match for those that Chapel needs for inter-node communication: RMA and AM. Over time, the GASNet team at Berkeley proved to be dedicated to creating stable, useful, well-engineered software with terrific user support, so GASNet has only become more of a no-brainer for us to leverage and rely upon over the years,” said Chamberlain, a Distinguished Technologist at HPE/Cray who is the technical lead for the Chapel project.

Chamberlain added, “While my team develops communication libraries that directly target the unique networks developed by HPE and Cray, we rely heavily on GASNet-EX for portably supporting Chapel on networks developed by other vendors, such as InfiniBand. Moreover, since GASNet-EX also supports HPE/Cray networks, it provides an alternate implementation of Chapel on our systems that we can compare to or that users can opt into when desired.”

“One of the things that I really like about GASNet is that it has stayed true to its original purpose,” said Damian Rouson, who leads Berkeley Lab’s Computer Languages and Systems Software (CLaSS) Group. He also leads the development of the Caffeine parallel runtime library, which uses GASNet-EX and aims to support the parallel Fortran features of modern Fortran compilers with the LLVM Flang compiler as the initial target.

According to Rouson, the fact that Bonachea and Hargrove have continually been part of GASNet’s development team over the years provides the project with a sense of stability and a clear purpose. The relatively small development team has also allowed them to remain agile in adapting and maintaining the tool to meet their clients’ needs and the challenges presented by evolving HPC architectures.

“From the very start, we decided that GASNet focuses on one-sided RMA and AM; we’ve been very clear-eyed about what we do and what we do well; we drew a box around that and have chosen our battles,” said Bonachea. “We also made backward-compatibility one of our main priorities. If compiler developers decide to invest in calling on GASNet, we’re not going to break that arbitrarily.”

According to Bonachea, the GASNet development team has generated many new features over the years, but their philosophy doesn’t force client runtimes to use them. Although GASNet-EX offers features optimized for exascale computing applications, he notes that code written for GASNet 1.0 two decades ago would still do the right thing on an exascale system.

“The backward compatibility aspect of the design is what I really value most about GASNet: the fact that people don’t have to rewrite their codes with each new version,” said Rouson. “My experience serving on the standard committee for Fortran, which is now Medicare age (65 years old), has impressed upon me the importance of backward compatibility for software sustainability and longevity.”

“With GASNet-EX, our code has passed the 20-year test of time,” said Bonachea. “Over the years, our small team had to make ad-hoc decisions without industry-wide consensus about how things should work in a communications layer because we needed to be adaptable to the needs of our clients. Despite this, we’ve made relatively few design mistakes, and our clients still find our tool useful. That’s something that I’m extremely proud of.”


Source: Linda Vu, Berkeley Lab
