This summer, over 40 undergraduate and graduate students collaborated with ALCF mentors on innovative research projects, gaining firsthand experience with high-performance computing (HPC) and artificial intelligence (AI) for science.
Oct. 7, 2024 — Every summer, the Argonne Leadership Computing Facility (ALCF), a U.S. Department of Energy (DOE) Office of Science user facility at DOE’s Argonne National Laboratory, hosts a new group of students to gain experience working on real-world scientific computing research.
“Our summer students get to work closely with experts in HPC on projects that harness powerful supercomputing and AI resources,” said Michael Papka, ALCF director and professor of computer science at the University of Illinois Chicago (UIC). “This collaborative environment helps these students advance their skills, preparing them to join the next-generation HPC and AI workforce.”
This year, over 40 students contributed to ALCF projects ranging from scaling up deep learning benchmark applications for exascale computing to compressing data for AI models that can give us insights into nuclear fusion. We spoke to four students about their work and experiences this summer at the ALCF.
Developing Digital Twins for Dexterous Robots
Athena Angara, a rising senior studying data science at UIC, used virtual reality (VR) and augmented reality (AR) to help create digital twins of critical components of robotic arms. These models allow researchers to explore experiments that are generally considered too time-intensive or hazardous, and determine whether it’s possible to complete these experiments remotely and automatically.
Using Unreal Engine 5, Unity, and NVIDIA’s Omniverse and Isaac Sim, Angara developed an immersive display system with real-time, low-latency performance and high spatial resolution for remote control processes. She also converted data to ensure accurate representations between these digital platforms.
“I had several breakthrough moments during my research,” Angara said. “One of the most notable was figuring out how to send joint angles and all the data from the robot to Omniverse. I discovered that by implementing a custom data serialization protocol combined with a real-time data streaming framework, we could achieve seamless integration between the robot and the virtual environment. This allowed for accurate and responsive control within the simulation.”
For Angara, interdisciplinary collaboration was a cornerstone of her summer at the ALCF. Along with her ALCF mentor, Victor Mateevitsi, Angara worked with Silvio Rizzi, Joe Insley, Yunghuo Kim, Nicola Ferrier, and Papka. “Collaborating with experts from various fields has shown me the power of teamwork and diverse perspectives,” she said. “This experience has equipped me with the confidence and skills to tackle complex challenges in my future career. I now feel more prepared and inspired to pursue innovative projects that can make a real difference.”
Scaling Up Applications for Aurora
Colin Luangrath’s work with the ALCF team centered on scaling the DLIO benchmark application, a tool that emulates the I/O patterns of modern scientific deep learning applications. He helped scale DLIO for use cases on Aurora, identified bottlenecks, and found solutions to mitigate issues with scaling to multiple nodes.
A rising sophomore studying computer science and psychology at the University of Wisconsin-Madison, Luangrath appreciated the opportunity to work with Argonne’s powerful supercomputing resources.
“Working on Polaris has been an incredible experience. I’ve learned a lot about HPC that I never would have experienced in any other field,” Luangrath said. “I felt like I had freedom to try things out and get hands-on experience with these systems and more time to learn about the technical side of things. It felt very collaborative.”
Luangrath said working with his ALCF mentor, Huihuo Zheng, challenged him to become a better problem solver. “Before this experience, I often made decisions without really thinking about what was going on, and just focused on fixing the problem. My mentor taught me a lot about debugging. I learned about investigating deeply and understanding why things are or aren’t working, rather than trying to find a quick workaround.”
Machine Learning for Molecular Dynamics Research
Hariharan Ramasubramanian, a Ph.D. student in mechanical engineering from Carnegie Mellon University (CMU), has focused his studies on computational materials modeling. This summer at the ALCF, he was able to explore new types of problems in his field, focusing on machine learning potentials (MLPs) for long-range systems.
In molecular dynamics simulations, MLPs are effective at modeling the structures and dynamics of complex systems. However, they struggle to handle charged species or systems with magnetization where the non-local effects become a major driving factor.
“It’s challenging to model a charged system with an interaction which has a length scale longer than its neighbors,” Ramasubramanian said. “In an ionic system, for instance, one might predict partial charges for each atom, which are then used to calculate long-range electrostatics. Similarly, various empirical electrostatics and dispersion baseline corrections can be incorporated. Our model can help address long-range interactions in such atomistic systems.”
Alongside his ALCF mentor, Álvaro Vázquez-Mayagoitia, Ramasubramanian researched introducing partial charges into the MLPs’ learnable descriptors.
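To illustrate the long-range term Ramasubramanian describes, here is a minimal Python sketch that sums pairwise Coulomb energies from per-atom partial charges. It is not the project's model: the function name, the two-ion geometry, and the hard-coded charges (standing in for charges a model would predict) are assumptions made for illustration, and the sketch ignores periodic boundaries and the Ewald-style summation a production code would need.

```python
import math

# Coulomb constant e^2 / (4*pi*eps0) in eV*Angstrom, so charges are in units
# of the elementary charge and distances are in Angstrom.
COULOMB_K = 14.399645


def coulomb_energy(positions, charges):
    """Sum the Coulomb energy over all unique atom pairs.

    Illustrative only: no distance cutoff, no periodic images.
    """
    energy = 0.0
    n_atoms = len(positions)
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            r = math.dist(positions[i], positions[j])
            energy += COULOMB_K * charges[i] * charges[j] / r
    return energy


# Hypothetical ion pair separated by 2 Angstrom; the charges stand in for
# values a machine-learned model might predict.
positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
charges = [1.0, -1.0]
pair_energy = coulomb_energy(positions, charges)
```

Because every pair contributes no matter how far apart the atoms are, this term cannot be captured by a descriptor that only sees an atom's near neighbors, which is the limitation the quote points to.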
“Working at the ALCF exposed me to new types of problems,” Ramasubramanian said. “Long-range interactions are not something we focus on at CMU. I got new understandings of materials modeling from this project.”
This summer also gave Ramasubramanian new insights into his career. “I got to see how national labs are different from academic settings. There’s more collaboration across fields, and it’s easier to meet new people,” he said.
Solving Problems in Nuclear Fusion with AI-enabled Disruption Prediction
In nuclear fusion research, the leading experimental device is the tokamak, which aims to achieve sustained reactions by using powerful magnets to confine hot plasma. Electron Cyclotron Emission Imaging (ECEI) is a diagnostic technique that provides detailed information on the confined plasma. Integrating ECEI data with machine learning models can enable researchers to predict future plasma states. This approach is crucial for minimizing disruptions that could damage the tokamak and halt experiments.
This summer, Apollo Lee, a rising junior studying electrical engineering at Stanford University, leveraged ALCF’s HPC resources to identify ways to better integrate ECEI data with predictive models. Alongside his ALCF mentor, Kyle Felker, Lee explored the application of both next-generation lossless and lossy compressors and signal filtering techniques.
“It’s a huge amount of data—terabytes upon terabytes,” Lee said. “If you’re looking to train machine learning models with it, it’s worth seeing how you could scale it down—of course, without losing any important information.”
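The difference between lossless and lossy compression can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the compressors Lee evaluated: zlib stands in for a lossless compressor, uniform quantization followed by zlib stands in for a lossy one, and a synthetic sine wave stands in for ECEI data. All function names are hypothetical.

```python
import math
import struct
import zlib


def lossless_compress(values):
    """Pack floats as 64-bit doubles and compress them with zlib."""
    return zlib.compress(struct.pack(f"<{len(values)}d", *values))


def lossless_decompress(blob, count):
    """Recover the original doubles bit for bit."""
    return list(struct.unpack(f"<{count}d", zlib.decompress(blob)))


def lossy_compress(values, step):
    """Quantize each value to the nearest multiple of `step`, then compress.

    The reconstruction error per value is bounded by about step / 2.
    """
    quantized = [round(v / step) for v in values]
    return zlib.compress(struct.pack(f"<{len(quantized)}i", *quantized))


def lossy_decompress(blob, count, step):
    """Rebuild approximate values from the quantized integers."""
    quantized = struct.unpack(f"<{count}i", zlib.decompress(blob))
    return [q * step for q in quantized]


# Synthetic signal (an assumption; real diagnostic data would be far larger).
signal = [math.sin(0.01 * i) for i in range(10_000)]
step = 1e-3

exact = lossless_decompress(lossless_compress(signal), len(signal))
approx = lossy_decompress(lossy_compress(signal, step), len(signal), step)
max_error = max(abs(a - b) for a, b in zip(signal, approx))
```

Comparing the byte lengths of the two compressed outputs against `max_error` shows the tradeoff Lee describes: the lossy path gives up a bounded amount of precision in exchange for a smaller footprint, while the lossless path reproduces the input exactly.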
Lee particularly enjoyed the HPC aspect of this project. “Working with Polaris has been awesome. It’s amazing to see what these supercomputers are capable of—they handle these huge datasets with ease.”
Reflecting on his summer, Lee said, “Working at Argonne this summer was an incredible experience. I was able to meet a lot of really talented and passionate people, and it kept things exciting every day. As someone who loves learning and tackling new challenges, I felt right at home during my time here.”
Source: Rachel Taub, Argonne