Even with the unprecedented computing power currently at our disposal, a tenfold increase in supercomputing capacity is needed to solve many of today’s scientific grand challenge problems.
In 2019, the National Science Foundation (NSF) formally invited TACC to develop a plan for a Leadership-Class Computing Facility (LCCF), a center for cyberinfrastructure that includes hardware, software, storage, people, and programs. The facility would begin operations around 2025 and support academic researchers in the U.S. on a decadal scale.
Its first mission: deploy a system 10 times more capable than Frontera.
The project is being planned as part of the NSF’s Major Research Equipment and Facilities Construction (MREFC) process, which funds very large-scale scientific instruments and their facilities. It recently progressed from the Conceptual Design phase to the Preliminary Design phase.
“The ten-year initial operational period for MREFC projects will provide the nation’s scientists and engineers with a long-term partner, enabling collaborations not possible with shorter awards,” according to John West of TACC, one of the principals on the planning effort. “It will change the way scientists integrate computation into their research.”
The primary constituencies for the facility will be current large-scale simulation users; the NSF Large Facilities Community, including sites like the Laser Interferometer Gravitational-Wave Observatory (LIGO) and the Large Hadron Collider (LHC); experiments in edge computing with large networks of sensors for future smart cities; new users in AI and machine learning; cyberinfrastructure researchers; and researchers at other national labs.
TACC is designing the LCCF and its premier system at a moment when HPC’s possible directions are expanding. Processors and system architectures are diversifying, and the workloads that centers support — once reliably simulation- and modeling-driven — are expanding to include machine and deep learning, data assimilation, new forms of data and visual analysis, and urgent computing. Meanwhile, centers continue to face an insatiable demand for compute time from all fields of science.
“High performance hardware itself is only a part of the solution,” West said. “Grand Challenge problems also require breakthroughs in algorithms, computational science, data management and visualization, software engineering, scientific workflows, and system architecture as well as a community of expertise built around the technological capabilities in these areas to ensure that the technologies, hardware and software, can be translated into practice.”
While TACC is addressing the broad needs of computational science in its LCCF designs, the center is also working to broaden the pipeline of computing professionals and to increase support for publicly funded computing efforts by communicating the importance of computing to the public.
TACC plans to incorporate a computational science museum and learning center into its LCCF designs — a place where students, leaders, and local residents can learn about computational thinking and applications of computational science in daily life.
Said Dan Stanzione, TACC executive director: “Our hope is that the Leadership-Class Computing Facility becomes a place where critical, life-saving, world-changing science can be conducted, and where those successes can be communicated to the next generation of innovators.”
Leaders of 24 U.S. research groups at the forefront of high performance computing participated in a workshop hosted by TACC and its partners at NSF and UT Austin’s Oden Institute for Computational Engineering and Sciences in January 2020.
Discussions and input from the workshop informed a report titled “Future Directions in Extreme Scale Computing for Scientific Grand Challenges.”
The report, published earlier this year, identifies a number of scientific grand challenge problems that will drive high performance computing (HPC) over the next decade, and discusses which kinds of research and programs should be prioritized.
Requirements gathering will continue over the course of the LCCF design period through additional workshops, community events, and other opportunities for input from the community.
[The report is available for download; a high-level synopsis can also be found online. The team welcomes comments and input on additional grand challenges from the scientific community at [email protected].]
About the Author
Aaron Dubrow is a science and technology writer with the Communications, Media & Design Group at the Texas Advanced Computing Center.
Header image: proposed LCCF datacenter expansion at the J.J. Pickle Research Center, Austin, Texas.