ESnet Applying Global Networking Expertise to GRETA Spectrometer for Experiments at Michigan Facility

August 4, 2020

Aug. 4, 2020 — For decades, ESnet engineers have deployed the latest technologies and developed critical tools to build a high-speed network that crisscrosses the nation and spans the Atlantic Ocean. Now, a small team is doing the same for a specialized network that will transport and organize data across distances measured in feet rather than thousands of miles.

Nuclear physicists at Berkeley Lab are building the GRETA experiment, short for Gamma Ray Energy Tracking Array. The gamma ray detector will be installed at the Department of Energy’s Facility for Rare Isotope Beams (FRIB) located at Michigan State University in East Lansing.

The GRETA spectrometer will go online with first physics in 2024. When complete it will house an array of 120 detectors that will produce up to 480,000 messages per second — totaling 4 gigabytes of data per second — and send them through a computing cluster for analysis. While the data will traverse a network of about 50 meters, the system has been designed so the data could easily be sent to more distant high performance computing systems.
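As a rough back-of-the-envelope check on those figures, the sketch below derives the average message size and the per-detector load. It assumes decimal gigabytes and a load spread evenly over the 120 detectors; neither assumption is stated in the article, and the function and variable names are purely illustrative.

```python
# Figures quoted in the article.
MESSAGES_PER_SECOND = 480_000
BYTES_PER_SECOND = 4_000_000_000  # assumption: "4 gigabytes" means 4 * 10**9 bytes
NUM_DETECTORS = 120


def average_message_bytes(byte_rate, message_rate):
    """Average payload per message implied by the two aggregate rates."""
    return byte_rate / message_rate


avg_bytes = average_message_bytes(BYTES_PER_SECOND, MESSAGES_PER_SECOND)

# Assumption: every detector contributes an equal share of the traffic.
messages_per_detector = MESSAGES_PER_SECOND / NUM_DETECTORS
bytes_per_detector = BYTES_PER_SECOND / NUM_DETECTORS

print(f"average message size: {avg_bytes / 1e3:.1f} kB")
print(f"per detector: {messages_per_detector:.0f} messages/s, "
      f"{bytes_per_detector / 1e6:.1f} MB/s")
```

Under these assumptions each message averages a little over 8 kB, and each detector emits about 4,000 messages per second.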

“We will be analyzing everything in real time, on the fly with no intermediate storage,” said Mario Cromaz, a Berkeley Lab physicist in charge of the computing component of GRETA. “We had an idea of how we wanted the computing to work, but it was also a networking problem and we didn’t have the technical wherewithal, so we approached ESnet. That’s the kind of expertise we could only find at ESnet.”

ESnet network engineer Eli Dart, who is the computing system architect for the project, said ESnet agreed to help so that networking could be integrated into the project early in a way that is scalable and extensible. Dart also sees it as potentially the start of something even bigger — a system that is a building block for the “Superfacility” concept to seamlessly stitch together experiments, networks and computing resources.

“It’s a strategic experiment on ESnet’s part — if we can get in early and help with the design, we can try to help the experiment do things that would otherwise be very difficult,” Dart said. “In a deep collaboration like this, we can learn what’s important in the context of the experiment, and that can help us improve our services to the scientific community.”

First of its kind

A rendering of GRETA, the Gamma-Ray Energy Tracking Array. Image courtesy of Berkeley Lab.

GRETA is a gamma ray spectrometer that will measure, with unprecedented resolution, the energy of gamma rays created by nuclear collisions inside a compact sphere of high-purity germanium crystals. It consists of 120 highly segmented, large-volume coaxial germanium crystals, combined in groups of four to form 30 Quad Detector Modules.

Cromaz said GRETA is the first of its kind in that it will track the positions of the scattering paths of the gamma rays using an algorithm specifically developed for the project. This capability will help scientists understand the structure of nuclei, which is not only important for understanding the synthesis of heavy elements in stellar environments, but also for applied-science topics in nuclear energy, nuclear forensics, and stockpile stewardship.

Since the excited nuclei emitting the gamma rays are moving very fast — at a large fraction of the speed of light — they create a Doppler effect. In order to accurately measure their energy, Cromaz said scientists need to know the angle the ray is coming from. The capability to do this is what makes GRETA unique.
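To illustrate why the emission angle matters, here is a minimal sketch of the standard relativistic Doppler correction, E_rest = E_lab * gamma * (1 - beta * cos(theta)). The formula is textbook physics rather than something taken from the GRETA software, and the beam velocity (beta = 0.3) and the 1,000 keV lab energy are illustrative values, not figures from the article.

```python
import math


def doppler_correct(e_lab_kev, beta, theta_rad):
    """Recover the rest-frame gamma-ray energy from the lab-frame energy.

    e_lab_kev: energy measured in the detector (keV)
    beta:      emitter speed as a fraction of the speed of light
    theta_rad: angle between the emitter's velocity and the gamma ray (radians)
    """
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return e_lab_kev * gamma * (1.0 - beta * math.cos(theta_rad))


# Illustrative values only (not from the article).
BETA = 0.3
E_LAB_KEV = 1000.0

# The same measured energy corresponds to different rest-frame energies
# depending on the assumed angle, so an angle error becomes an energy error.
for theta_deg in (44.0, 45.0, 46.0):
    e_rest = doppler_correct(E_LAB_KEV, BETA, math.radians(theta_deg))
    print(f"theta = {theta_deg:.0f} deg -> rest-frame energy {e_rest:.1f} keV")
```

Even a one-degree change in the assumed angle shifts the corrected energy, which is why locating the first interaction point precisely pays off in energy resolution.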

The project team has just finished the design phase and the next formal project review will be in early August. To get this far, a one-quarter version of the experiment, called GRETINA, was built with prototypes to test the concepts and is currently performing experiments at Michigan State University. With a favorable August review, the GRETA team anticipates asking the DOE for approval to commence construction by the end of the fiscal year.

Bringing order to the data

According to Cromaz, the detectors, built with field-programmable gate arrays, will spray out packets of data, which is relatively simple to do. The hard part is creating a buffer to catch the data and feed it through the network to the thousands of threads of computation running in the cluster for analysis.

“There are actually two phases to the analysis in the gamma ray tracking array,” Cromaz said. “The first phase is locating where the interaction points of the gamma ray with the detector material occurred and the second phase is looking at all interaction points globally in the detector and subdividing/ordering them into likely gamma ray tracks.”

The first computing stage derives the number and location of interaction points. This phase only depends on the digitized signals from a given detector crystal (there are 120 crystals which tile the sphere).

“In GRETA, it’s advantageous to arrange things this way as converting the raw digitized waveforms to interaction points — essentially a set of x, y, z coordinates and energies — reduces the data volume by an order of magnitude,” Cromaz said. “This reduces the load on the second phase, the global event builder, and allows us to implement it on a single node, which simplifies the overall design.”

Eric Pouyoul, who leads ESnet’s testbed efforts, designed the forward buffer to quickly collect the data, which will then be pulled into analysis jobs by the computing cluster. The forward buffer must receive the high-speed packet streams from 120 detectors with zero packet loss, and then feed the data to the cluster asynchronously.
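The push-in, pull-out pattern described here can be sketched with a bounded queue. This is a toy model under stated assumptions, not Pouyoul's implementation: threads stand in for detectors and cluster jobs, the message counts, worker count, and queue size are invented for illustration, and a blocking `put` stands in for whatever mechanism the real system uses to avoid dropping packets.

```python
import queue
import threading

NUM_DETECTORS = 120          # from the article
MESSAGES_PER_DETECTOR = 50   # illustrative
NUM_WORKERS = 8              # illustrative
BUFFER_SLOTS = 1024          # illustrative


def detector(det_id, buf):
    """Push messages into the buffer; a full buffer blocks instead of dropping."""
    for seq in range(MESSAGES_PER_DETECTOR):
        buf.put((det_id, seq))


def worker(buf, results, lock):
    """Pull messages whenever this worker is free, independent of the senders."""
    while True:
        item = buf.get()
        if item is None:  # sentinel: no more data
            break
        with lock:
            results.append(item)


def run():
    buf = queue.Queue(maxsize=BUFFER_SLOTS)
    results, lock = [], threading.Lock()
    workers = [threading.Thread(target=worker, args=(buf, results, lock))
               for _ in range(NUM_WORKERS)]
    producers = [threading.Thread(target=detector, args=(d, buf))
                 for d in range(NUM_DETECTORS)]
    for t in workers + producers:
        t.start()
    for p in producers:
        p.join()
    for _ in workers:     # one sentinel per worker, sent after all data
        buf.put(None)
    for w in workers:
        w.join()
    return results


results = run()
print(f"received {len(results)} of {NUM_DETECTORS * MESSAGES_PER_DETECTOR} messages")
```

The point of the design is decoupling: senders never need to know which worker will handle a message, and workers set their own pace.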

Pouyoul said the project was challenging on a number of levels, from the physics involved to the nature of the data to the demands of real-time processing. The first step was to write the computing code and algorithms for handling the data. Although he has written high performance code in the past, this project required him to use other skills he’s developed over the years. Once he had the software, he needed to make sure it could handle the outpouring of data.

“The simulation of the crystals was relatively easy,” he said. “But the simulation of the physics, the nuclear behavior at the heart of GRETA, I never did anything like this before.”

Since not all of the crystals would detect every interaction, Pouyoul used a statistics-based model to recreate what would happen inside the detector. He also had to make the code efficient so it could run on the actual hardware GRETA will use. “I was able to build the model of the physics inside GRETA,” he said, “but don’t expect me to really understand it.”

“The first phase is the most computationally intensive part of the process, as the maximum data generation rate is 480,000 calculations per second and each calculation requires about five milliseconds per CPU core, hence the requirement for a cluster,” Cromaz said.
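The arithmetic behind that requirement can be worked through in a few lines. This is a rough estimate that assumes every core is kept fully busy with no scheduling or I/O overhead; a real cluster would need headroom beyond this floor.

```python
# Figures quoted in the article.
CALCULATIONS_PER_SECOND = 480_000
MILLISECONDS_PER_CALCULATION = 5  # per CPU core


def cores_needed(calc_rate, ms_per_calc):
    """Minimum cores to keep up, assuming perfect utilization (an idealization)."""
    # Each second of incoming data brings calc_rate * ms_per_calc milliseconds
    # of CPU work; one core supplies 1,000 milliseconds of work per second.
    return calc_rate * ms_per_calc / 1000


cores = cores_needed(CALCULATIONS_PER_SECOND, MILLISECONDS_PER_CALCULATION)
print(f"minimum cores at full utilization: {cores:.0f}")
```

At the peak rate, that works out to 2,400 fully occupied cores, consistent with the "thousands of threads of computation" mentioned earlier.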

From there, the data will pass through a second system, also designed by Pouyoul, called the “global event builder.” Using software written by Pouyoul, the system looks at the timestamps on all the incoming data and reassembles them into a single stream ordered by timestamp. The algorithm also determines which event each piece of data belongs to and assembles the events accordingly. This data will be stored for additional analysis based on timestamps and events.
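One common way to do this kind of time-ordered merge and grouping is sketched below. It is an assumption-laden illustration, not the GRETA event builder: it presumes each crystal's stream is already sorted by timestamp, and it groups hits into an event when they fall within a fixed coincidence window of the event's first hit. The window length, tuple layout, and function name are all hypothetical.

```python
import heapq


def build_events(streams, window):
    """Merge per-crystal hit streams and group hits into events.

    streams: list of lists of (timestamp, crystal_id, payload) tuples,
             each list already sorted by timestamp (assumption)
    window:  maximum timestamp gap from an event's first hit (hypothetical)
    """
    events = []
    current = []
    # heapq.merge lazily yields one globally time-ordered stream.
    for hit in heapq.merge(*streams, key=lambda h: h[0]):
        if current and hit[0] - current[0][0] > window:
            events.append(current)
            current = []
        current.append(hit)
    if current:
        events.append(current)
    return events


# Toy input: three crystals, timestamps in arbitrary ticks.
streams = [
    [(100, 0, "a"), (500, 0, "b")],
    [(102, 1, "c"), (900, 1, "d")],
    [(503, 2, "e")],
]
events = build_events(streams, window=10)
for number, event in enumerate(events):
    print(number, [(t, crystal) for t, crystal, _ in event])
```

With this toy input, the hits at ticks 100 and 102 form one event, 500 and 503 a second, and 900 a third.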

“This has to happen in real time,” said Pouyoul, who called the project the most exciting work he has done in his 11 years at the lab. “Moving the data from the events through the system to storage cannot take more than 10 seconds.”

While the GRETA project has been gratifying for ESnet, it will also provide more experience toward developing the “Superfacility” concept developed by Berkeley Lab’s Computing Sciences organization. The Superfacility framework comprises the seamless integration of experimental and observational instruments with computational and data facilities using high-speed networking. While the concept is straightforward, achieving it requires resolving any number of smaller issues, which vary by facility.

“Because it was designed to be ultimately connected to the wider network, GRETA will be Superfacility-ready,” Dart said. “We see GRETA as a strategic experiment on ESnet’s part; if we get involved early, we can help with the design and help the experiment do things that otherwise could have been very difficult.

“The fun part of all this is that we would like to see GRETA be a proving ground for this type of environment and then see it be widely adopted,” Dart said. “In fact, we’ve already received inquiries from other sites. If we can help others take advantage of what we’ve learned, then everybody wins.”

About ESnet

The Energy Sciences Network (ESnet) is a high-performance, unclassified network built to support scientific research. Funded by the U.S. Department of Energy’s Office of Science (SC) and managed by Lawrence Berkeley National Laboratory, ESnet provides services to more than 50 DOE research sites, including the entire National Laboratory system, its supercomputing facilities, and its major scientific instruments. ESnet also connects to 140 research and commercial networks, permitting DOE-funded scientists to productively collaborate with partners around the world.


Source: Jon Bashor, ESnet
