Programming the Soon-to-Be World’s Fastest Supercomputer, Frontier

By Tracey Bryant

January 5, 2021

Jan. 5, 2021 — What’s it like designing an app for the world’s fastest supercomputer, set to come online in the United States in 2021? The University of Delaware’s Sunita Chandrasekaran is leading an elite international team in just that task.

Chandrasekaran, assistant professor of computer and information sciences, recently was named the David L. Mills and Beverly J.C. Mills Career Development Chair at UD. This professorship, funded through a generous gift from David L. Mills, professor emeritus, and Beverly J.C. Mills, a UD alumna, was created to reward exceptional young female faculty in the departments of electrical and computer engineering or computer and information sciences.

For the past year, Chandrasekaran has been leading one of eight teams working on applications for the new Frontier supercomputer being built within the U.S. Department of Energy’s Oak Ridge National Laboratory (ORNL) in Tennessee. This exascale computer, capable of performing a mind-boggling quintillion calculations per second — that’s a 1 with 18 zeros after it (1,000,000,000,000,000,000) — is expected to launch in 2021. It will be at least five times faster than ORNL’s current supercomputer, Summit, which was the world’s fastest supercomputer until Japan’s Fugaku came online this past summer.

University of Delaware Prof. Sunita Chandrasekaran is leading an international team designing an application for the Frontier exascale supercomputer, now being built at Oak Ridge National Laboratory. Photo courtesy of Sunita Chandrasekaran

Chandrasekaran’s team is working with a plasma physics application called PIConGPU (Particle in Cell), which can simulate interactions between lasers and matter. Enlisting Frontier’s massive computing power, the team is working to generate fast, predictive simulations for next-generation plasma (particle) accelerators. Such tools are critical to advancing radiation therapies for cancer, as well as expanding the use of X-rays to probe the structure of materials.

Her collaborators have had high praise for the team effort.

“Dr. Chandrasekaran’s PIConGPU team is an elite group spanning many geographic regions, scientific domains and backgrounds,” said Dr. Nicolas Malaya, technical lead from Advanced Micro Devices (AMD) for the Exascale Centers of Excellence. “I fully expect this application to generate important scientific results from this team in computational science, supercomputing and plasma physics.”

Dr. Michael Bussmann, head of the Center for Advanced Systems Understanding (CASUS) at HZDR, a research laboratory based in Germany, added: “Together with the University of Delaware and our partners at Helmholtz-Zentrum Dresden-Rossendorf, CASUS scientists are working at the frontier of high performance computing. Our solutions will enable realistic simulations for next generation particle accelerators based on plasma technologies.”

UDaily, a publication of the University of Delaware, recently connected with Chandrasekaran for an update on the team’s work.

Q: How is the project going?

Chandrasekaran: Pretty fantastic. We are thrilled to have gotten access to the new AMD Instinct MI100 accelerator cards from AMD. We ran the full PIConGPU on these newly released cards, and in our studies using a single GPU, we observed a 1.4-times increase in speed compared to the MI60. This is promising and gives us a lot to look forward to in the next-generation CPUs and GPUs for Frontier.

The team is using accelerator cards like this from Advanced Micro Devices (AMD) to speed the processing of plasma simulations and perform other intensive calculations. Photo courtesy AMD.

Q: Speaking of CPUs and GPUs, how do you describe the basic differences between them? 

Chandrasekaran: In general, CPUs, or central processing units, are the workhorses of computing systems. In the recent past, these systems have been upgraded with GPUs, or graphics processing units, which were first used in gaming applications but are now mainstream in high-performance computing, big data and analytics problems. Let’s take painting as an analogy. While painting with watercolors is just fine, imagine using gouache to enhance certain portions of your painting. Now those areas have an opaque, matte-like finish, where the brush strokes are not visible anymore, and overall the painting looks more vibrant and crisp. Watercolor is your CPU and gouache is your GPU.

Q: In looking at these two supercomputing titans, how do you compare Frontier’s speed to Summit’s? 

Chandrasekaran: Chatting with my collaborator, Dr. Alexander Debus at HZDR, helped me make some observations — simulations like ours with PIConGPU that would take two months on Summit might end up taking one week on Frontier. This also means we would now be able to run several 10-million time-step simulations on Frontier (each time step would take ~50 milliseconds). Time-step simulations allow us to analyze the operation of the computer’s power system from hour-to-hour intervals, right down to thousandths of a second.
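The figures quoted in this answer can be checked with simple arithmetic. The short Python sketch below is an illustration only, not part of the team's code. It takes the 10 million time steps and roughly 50 milliseconds per step from the interview, and it assumes that "two months" means 60 days and "one week" means 7 days.

```python
# Back-of-the-envelope check of the figures quoted in the interview.
STEPS = 10_000_000        # time steps per simulation (from the interview)
MS_PER_STEP = 50          # roughly 50 milliseconds per step (from the interview)

total_seconds = STEPS * MS_PER_STEP / 1000
total_days = total_seconds / 86_400   # 86,400 seconds in a day

# Assumption: "two months" is taken as 60 days and "one week" as 7 days.
speedup = 60 / 7

print(f"{total_seconds:,.0f} s = {total_days:.1f} days per simulation")
print(f"Implied Summit-to-Frontier speedup: about {speedup:.1f}x")
```

Under these assumptions, one 10-million-step simulation takes about 5.8 days, which is in the same range as the "one week" estimate, although the interview does not say that the two numbers describe the same run. The implied speedup of roughly 8.6 times is consistent with the article's statement that Frontier will be at least five times faster than Summit.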

Q: Who are your collaborators and what is it like coalescing an international team?

Chandrasekaran: My collaborators are from ORNL, HZDR, CASUS, and the Georgia Institute of Technology. I have not met half of my team in person, yet it feels like we have been working together for years. We are now a small family.

Once every few months, we make sure to discuss the team’s, as well as the project’s, common vision and goals to ensure that the short- and long-term goals align well with CAAR deliverables. This is particularly important for an international team like ours. Most of the conversations and discussions are hashed out over email/Slack prior to scheduling a group phone call, given that there are more than a few hours of time difference between the U.S. and Germany.

Q: What is the most exciting/rewarding aspect of the project for you?

Chandrasekaran: I believe it is the interdisciplinary component of this project. It is intriguing to think about applying computer science concepts to a real-world scientific application. I am also thrilled that our close collaborations have led to this project being funded by Dr. Michael Bussmann (CASUS at HZDR, Germany). This is my first internationally funded collaborative project.

Q: What are the areas where Frontier is poised to have the greatest impact? Do you expect Frontier to help advance future virus research, for example?

Chandrasekaran: I believe so, especially when we are in the phase of integrating high-performance computing (HPC), artificial intelligence (AI) and data science. Large-scale (and fast) simulations that couldn’t be imagined just a few years ago are now going to become possible with the massive compute resources that Frontier is going to offer. Such compute capabilities are of paramount importance not just to virus research but also to studies like finding a cure for Alzheimer’s disease or studying climate change.

Q: Has COVID-19 impacted your work? 

Chandrasekaran: It definitely has. Since March, life has been different. I miss running down to my Computational and Research Programming Lab and having a face-to-face conversation with my students. We all miss our in-person group meetings. The pandemic has taught us what “not” to take for granted. Having said that, no matter how exhausting day-to-day life has become, I am still grateful to Zoom, Slack and other modes of communication that help me stay in touch with my research group. We are clearly re-inventing newer ways to communicate and do research.

Q: How are UD students contributing to the effort?

Chandrasekaran: My Ph.D. student, Matt Leinhauser, has been working on this project since its inception. With mentorship from me and my CAAR team (especially Rene Widera, Sergei Bastrakov and former CAAR liaison Ronnie Chatterjee), Matt has been able to put together two technical documents on profilers, which are tools that identify the portions of a computer program that take the most computation time. We have so far used NVIDIA’s nvprof and Nsight profiler tools to dive deeper into the code. HZDR also invited Matt to spend last winter (January 2020) with them, which was a rewarding opportunity when he was still in his first year of the Ph.D. program.
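To illustrate what a profiler reports, here is a minimal sketch that uses Python's built-in cProfile module as a stand-in. It is not the team's workflow, which relies on NVIDIA's nvprof and Nsight on GPU code. The functions `cheap_step`, `expensive_step` and `simulation` are hypothetical and exist only to give the profiler something to measure.

```python
import cProfile
import io
import pstats


def cheap_step(n):
    """Hypothetical lightweight routine."""
    return sum(range(n))


def expensive_step(n):
    """Hypothetical hot spot: a deliberately slow nested loop."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * j
    return total


def simulation():
    """Hypothetical stand-in for one pass of a simulation."""
    cheap_step(1_000)
    expensive_step(300)


# Record timing data for every function called while the profiler is enabled.
profiler = cProfile.Profile()
profiler.enable()
simulation()
profiler.disable()

# Sort by cumulative time so the most expensive calls appear first.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

In the printed report, `expensive_step` should account for most of the time spent inside `simulation`, which is the kind of hot spot a developer would target first when optimizing.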

Q: What’s on the horizon?

Chandrasekaran: With support from the Frontier Center of Excellence team, we will be marching forward to port PIConGPU to the early access systems and prepare the application for Frontier, which is being built as we speak. As next steps, we will be working on optimizing PIConGPU on the early access systems and speeding up the simulations even further.

This simulation on Oak Ridge National Laboratory’s Summit supercomputer demonstrates the principle of Laser Wakefield Electron Acceleration, where a laser pulse is introduced to form an electron plasma wave. Shown at left: Electric, magnetic and current density fields are colored in red, yellow and green, respectively. At right: The density of electrons being accelerated. Courtesy of Benjamin Hernandez (OLCF) and Richard Pausch and Felix Meyer (HZDR).

Source: Tracey Bryant, University of Delaware
