HPC is entering a new era: exascale is (somewhat) officially here, but Moore’s law is ending. Power consumption and other sustainability concerns loom over the enormous systems and chips of this new epoch, for both cost and compliance reasons. Reconciling the need to continue the supercomputer scale-up while reducing HPC’s environmental impacts has many in the field asking: in HPC, can fast be green, or is it inherently a contradiction?
At SC21, a group of experts asked just that question at a birds-of-a-feather session titled “Can fast be green? Opportunities and challenges for Europe when making HPC sustainable.” The session was moderated by Maike Gilliot, an HPC project manager for the European Technology Platform for HPC (ETP4HPC), and Andreas Wierse, managing director of HPC firm Sicos BW. There were two featured speakers: Franz-Josef Pfreundt, an HPC manager at the Fraunhofer Institute, and Daniele Cesarini, a senior engineer at Cineca. Both speakers are also members of the steering board for ETP4HPC, a private organization aimed at promoting European HPC development.
“Why are we, in Europe, looking at this topic?” Gilliot opened. “And the truth is that, of course, all of us, we should engage in efforts for reducing climate change and for watching and controlling the use of energy – but beyond this, on a European level, we will also have constraints forcing us to look into these topics.” Indeed, beyond Europe’s longstanding – and tightening – restrictions on energy production and carbon intensity, Europe is soon implementing legal requirements around the environmental total cost of ownership.
Pfreundt presented first. “Although we have increased the computing power a lot [between 2010 and 2018], the energy consumption [of the sector] more or less stayed flat,” he said, attributing that decoupling to improvements in cooling optimization. “But the predictions are not so good,” he continued. “There is an explosion in front of us” – and, he said, a lot of work would be necessary to keep energy consumption flat in the face of rapidly expanding and multiplying datacenters.
To that end, Pfreundt said, there were a number of promising signs: big tech companies like Apple and Google investing heavily in renewable energy, as well as initiatives like the Green500 and the Energy Efficient HPC Working Group. Still, he said, computing power was growing exponentially – and energy reductions weren’t enough to keep up.
“The question is: what can we do?” he said. And for Pfreundt, the answer comes down to reframing the conversation.
“I am very much in favor of renaming it ‘high-efficiency computing’ instead of ‘high-performance computing,’” he said, “since we really are the community that knows how to design fast algorithms, we know how to make really efficient implementations, how to efficiently parallelize algorithms – which is always necessary – and how to choose the right hardware to do it.”
For over a decade, Pfreundt said, he’d been wanting to work to design more efficient hardware with these principles in mind – and, he said, the European Processor Initiative (a project to develop homebuilt European processors) had enabled that. By way of illustration, Pfreundt highlighted a domain-specific accelerator – the stencil/tensor accelerator (STX) – being developed under the EPI. The STX, he said, employed hardware-software co-design to maximize performance while prioritizing portability. The key, Pfreundt said, was looking not at flops per watt, but at application performance per watt. “Software and algorithms are the key factors in green computing,” he said – and compilers and hardware architecture were there to make it easier.
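The distinction Pfreundt draws can be made concrete with a back-of-the-envelope sketch. All numbers below are hypothetical, chosen only to illustrate how a hardware-centric metric (peak flops per watt) and an application-centric metric (useful work per joule) can rank the same two systems in opposite orders:

```python
# Hypothetical comparison: peak flops/watt vs. application performance/watt.
# None of these figures come from the talk; they only illustrate the idea.

def flops_per_watt(peak_flops, power_watts):
    """Hardware-centric metric: peak floating-point ops per second per watt."""
    return peak_flops / power_watts

def app_perf_per_watt(work_units, runtime_s, power_watts):
    """Application-centric metric: useful work per joule consumed."""
    return work_units / (runtime_s * power_watts)

# System A: higher peak flops, but the application exploits them poorly.
a = {"peak": 2e15, "power": 1e6, "work": 1000, "runtime": 500}
# System B: lower peak flops, but a well-matched accelerator (e.g. a
# stencil-oriented design) finishes the same work in half the time.
b = {"peak": 1e15, "power": 8e5, "work": 1000, "runtime": 250}

print(flops_per_watt(a["peak"], a["power"]) >
      flops_per_watt(b["peak"], b["power"]))        # A wins on peak flops/watt
print(app_perf_per_watt(a["work"], a["runtime"], a["power"]) <
      app_perf_per_watt(b["work"], b["runtime"], b["power"]))  # B wins on useful work/joule
```

Both comparisons print `True`: the “faster” system on paper is the less green one for this workload, which is exactly why Pfreundt puts software and algorithms first.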
Cesarini broadened the conversation, moving from energy efficiency to the system lifecycle for supercomputers. “If we look at the lifecycle of a supercomputing system from manufacturing to the decommissioning of a system, an HPC center [like Cineca] can have a huge impact on part of this lifecycle,” he said.
“We can start from the procurement,” Cesarini continued, explaining that Cineca was careful in procurement to select energy-efficient technologies. From there, he said, Cineca put a strong emphasis on efficient cooling systems – particularly warm-water cooling – to achieve a PUE “very close to 1.0.” Monitoring was also crucial: “We need to monitor, to collect, and to analyze the utilization of the system – but not only the system, the entire facility,” he said, in order to optimize energy use and cooling.
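For readers unfamiliar with the metric, PUE (power usage effectiveness) is simply total facility energy divided by the energy delivered to IT equipment, so 1.0 is the ideal. A minimal sketch, with hypothetical figures (not Cineca’s):

```python
# PUE = total facility energy / IT equipment energy.
# A PUE of 1.0 means every joule goes to computing; cooling and other
# overheads push it higher. Figures below are hypothetical.

def pue(total_facility_kwh, it_equipment_kwh):
    return total_facility_kwh / it_equipment_kwh

# Air-cooled datacenter with chillers: substantial cooling overhead.
print(round(pue(15_000, 10_000), 2))  # 1.5

# Warm-water-cooled system: overhead shrinks, PUE approaches 1.0.
print(round(pue(10_500, 10_000), 2))  # 1.05
```

Warm-water cooling helps because the coolant is warm enough to be rejected to the environment (or reused) without energy-hungry chillers, which is what lets centers like Cineca approach that 1.0 floor.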
But even after achieving excellent PUE and optimizing as much as possible, he explained, a problem remained: Cineca needed to replace its systems every four to five years. And since the center had little control over the manufacturing processes for its components, Cesarini said, reducing the environmental total cost of ownership meant extending the lifetimes of the systems themselves.
With Cineca’s time of ownership somewhat set in stone, the center has instead turned to parceling its systems out into smaller subsystems upon decommissioning and delivering them to smaller tier-1 and tier-2 national datacenters. Those centers, he said, often place a lower priority on energy efficiency and run older hardware – so the hand-me-down subsystems yield efficiency improvements for the receiving centers and longer lifetimes for the components.
Wierse said that he and others were working on a white paper on green HPC on behalf of ETP4HPC, scheduled for release in 2022. “The emphasis, at least in most of what we have seen, is currently on performance per watt,” he said – but CO2 footprint impacts, he continued, extended beyond the operation phase, as Cesarini had discussed. Wierse suggested that, as Moore’s law came to an end, the emphasis on parallelism could be an opportunity to extend the lifetimes of HPC systems and components.
Pfreundt highlighted other opportunities, such as reusing waste heat, that have been gaining steam (so to speak), particularly with high-profile systems like LUMI taking advantage of those techniques. Wierse, similarly, mentioned companies that were working to site HPC systems directly at renewable energy generation facilities, where they could take advantage of green energy without transmission hurdles.
And, again, much of the discussion circled back to the importance of software development as gains from hardware improvements began to diminish. “My suggestion to the software developers is to focus on the performance,” Cesarini said. Pfreundt echoed those sentiments, saying that users needed to optimize for the efficiency gains to be realized. “If you don’t push them, they run single-core on full nodes,” he said. “And we don’t want that.”
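Pfreundt’s complaint about single-core jobs on full nodes has a simple energy arithmetic behind it: an idle node still draws substantial baseline power, so a serial job pays nearly the whole node’s energy bill while using a fraction of it. A back-of-the-envelope sketch, with entirely hypothetical power figures:

```python
# Hypothetical illustration of the waste Pfreundt describes: a node's
# baseline draw dominates, so running one core on a 64-core node for 64x
# as long costs far more energy than a parallel run of the same job.

idle_w = 300.0      # assumed baseline draw of an allocated (mostly idle) node
per_core_w = 3.0    # assumed incremental draw per busy core
cores = 64

# Same job, run serially vs. with ideal 64x parallel scaling.
runtime_serial_h = 64.0
runtime_parallel_h = 1.0

energy_serial = (idle_w + per_core_w) * runtime_serial_h              # one busy core
energy_parallel = (idle_w + per_core_w * cores) * runtime_parallel_h  # all cores busy

print(round(energy_serial / energy_parallel, 1))  # 39.4
```

Even under these rough assumptions, the serial run consumes roughly 40x the energy for identical work – the gap software optimization and proper parallelization are meant to close.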