Sept. 5, 2023 — The U.S. Department of Energy (DOE) has announced its selection of a multi-institutional team of data scientists from General Atomics (GA), the San Diego Supercomputer Center (SDSC) and UC San Diego, Hewlett Packard Enterprise (HPE) and Sapientai to develop a Fusion Data Platform (FDP) for advancing high-priority fusion research. In support of this effort the DOE awarded the team a three-year, $7.4 million grant.
Led by GA, the FDP will initially be deployed at SDSC, located at the University of California San Diego. Once completed, the FDP will be made available to the scientific community, providing access to high-quality fusion data for the efficient creation of reproducible artificial intelligence (AI)/machine learning (ML) models. These models are intended to support the design and operation of fusion pilot plants (FPPs) across a broad range of designs and plasma configurations within a decadal timescale.
A suite of AI/ML modeling capabilities developed by Sapientai and UC San Diego computer science and engineering faculty Rose Yu and Sicun Gao will be integrated with the platform, allowing it to serve as a powerful data and analysis tool that meets the growing needs of the fusion science community.
“Creating a robust AI/ML platform with very large curated datasets and efficient processing tools will be transformational for fusion energy,” said Brian Sammuli, head of the Fusion Data Science Center at GA and principal investigator. “By advancing AI/ML research in fusion, we will be able to rapidly address many of the remaining challenges in fusion science and reactor development. We look forward to leading this team to provide an outstanding platform for the scientific community to advance fusion research and support the deployment of the first generation of fusion energy power plants.”
According to Raffi Nazikian, senior director and leader of the ITER Research Hub at GA, a key mission of the FDP is to accelerate AI/ML research by expanding access to high-quality fusion data and the tools needed to process the data at scale.
“The FDP will include experimental and simulated data in an integrated platform. We are talking many petabytes of data that will be easily accessible on the platform,” said Nazikian. “The success of the FDP will be measured by how well we serve the needs of the fusion and broader data science community, including students and researchers from universities, national laboratories and industry.”
SDSC Director Frank Würthwein, professor in the Department of Physics and at the Halıcıoğlu Data Science Institute at UC San Diego, said that the FDP is an important step toward harnessing the power of fusion data to advance the development of fusion energy.
“GA and SDSC have a long history dating back almost 40 years, and this is the beginning of a new chapter in our cooperation to advance fusion energy science and education,” Würthwein noted.
Paolo Faraboschi, HPE fellow and AI Research Lab director at Hewlett Packard Labs, said that his team is excited to help build a powerful data platform for fusion. “Among the FDP’s unique capabilities will be the ability for users to access, understand and leverage prior data and AI pipelines to advance their research and build reproducible, certifiable AI/ML models. We look forward to working with the scientific community on the FDP to help realize the decadal vision for fusion energy development.”
Craig Michoski, founder and CEO of Sapientai, also noted his group’s excitement to participate in the FDP project. “This is a phenomenal set of collaborative institutions, and we have high aspirations for the success and impact the FDP project will have across the fusion landscape,” he said. “We think the era of data-driven science and technology advancement is well upon us, and we are extremely excited to see how these tools applied to the treasure trove of DOE’s fusion data can advance the field and accelerate progress towards commercial fusion energy.”
Supporting Data-Informed FPP Designs
To achieve fusion conditions relevant for energy production, an FPP must sustain plasmas at temperatures exceeding 100 million degrees Celsius, approximately ten times the temperature at the center of the sun. In magnetic confinement fusion, plasmas are controlled using powerful electromagnets that shape and confine the superheated gas. At such extreme temperatures, the plasmas can exhibit instabilities that cause them to momentarily breach the magnetic fields and interact with the inner walls of the fusion machine, which could decrease efficiency or even cause damage. Successfully designing FPPs that account for these and other types of instabilities requires robust datasets to model and predict plasma behaviors across designs.
The FDP will help to address this need by making large-scale fusion data easier to access and analyze. The multi-institutional team will draw on its significant AI/ML industry expertise to develop the FDP as a resource that can be used collectively across distributed computational facilities.
The FDP will leverage GA’s scalable, fusion-specific data processing tool, TokSearch, to process and curate the datasets at the required scale. The team will also draw on HPE’s Common Metadata Framework to create reproducible workflows that include metadata tracking, source code integration, and data version control. A publishing portal will be incorporated into the system to facilitate search and discovery of these curated datasets. A suite of AI/ML modeling capabilities developed by Sapientai and UC San Diego will be integrated with the platform, allowing it to serve as a powerful data and analysis tool that meets the growing needs of the fusion science community.
About the Team
The San Diego Supercomputer Center was established in 1985 as one of the nation’s first supercomputer centers under a cooperative agreement with the National Science Foundation, in collaboration with UC San Diego and GA. SDSC provides resources, services and expertise to the national research community, including industry and academia, and features the Expanse, Voyager and National Research Platform supercomputers and innovative computing systems. Expanse supports SDSC’s theme of “Computing without Boundaries” with a data-centric architecture, public cloud integration and state-of-the-art GPUs for incorporating experimental facilities and edge computing. A first-of-its-kind experimental system with training and inference accelerators that provide high-performance, high-efficiency AI compute, Voyager supports AI research across a range of science and engineering domains. The National Research Platform provides a nationally distributed data and compute platform with GPUs and FPGAs for AI, and a content delivery system with data caches in the internet backbone across four continents.
Hewlett Packard Enterprise is the global edge-to-cloud company that helps organizations accelerate outcomes by unlocking value from all their data, everywhere. Built on decades of reimagining the future and innovating to advance the way people live and work, HPE delivers unique, open and intelligent technology solutions as a service.
Sapientai LLC combines ML and AI with data-intensive science, notably in nuclear fusion and plasma physics. The company provides versatile software solutions, including off-the-shelf applications as well as tailored services. With a firm belief in collaboration, Sapientai encourages innovative research partnerships. Committed to advancing scientific frontiers, the company aligns its work with the Department of Energy’s mission.
Since the dawn of the atomic age, General Atomics innovations have advanced the state of the art across the full spectrum of science and technology – from nuclear energy and defense to medicine and high-performance computing. Behind a talented global team of scientists, engineers, and professionals, GA’s unique experience and capabilities continue to deliver safe, sustainable, economical, and innovative solutions to meet growing global demands.
Source: Cynthia Dillon, SDSC