As HPC datacenters scale up, improving efficiency is crucial to avoiding correspondingly large growth in energy use (and the attendant costs and carbon footprint). Now, a team at Lawrence Berkeley National Laboratory’s National Energy Research Scientific Computing Center (NERSC) has made large strides toward optimizing HPC datacenters through a collaborative endeavor called the Efficiency Optimization Project.

A few years ago, Berkeley Lab researchers set out to optimize the design and operation of the lab’s new Shyh Wang Hall and its datacenter. Berkeley Lab was targeting strong sustainability metrics for the project, including a low PUE (power usage effectiveness, the ratio of a facility’s total energy use to the energy used by its computing equipment) of 1.1 – limiting cooling and other non-computing overhead to roughly 10 percent of the energy used for computing. Following the effort to meet Shyh Wang Hall’s sustainability goals, the Efficiency Optimization Project was formed by members of three of Berkeley Lab’s divisions (NERSC, the Energy Technologies Area (ETA) and Sustainable Berkeley Lab).
“There are strong ties between NERSC, the Energy Sciences Network (ESnet), and the research community in ETA,” said Rich Brown, a research scientist in ETA’s Building Technologies and Urban Systems Division, in an interview with NERSC’s Kathy Kincade. “Berkeley Lab facilities have traditionally been a place where we can apply some of these findings and work with facilities managers to see how these ideas work in a large institution.”
In essence, the Efficiency Optimization Project aims to use operational data analytics to improve datacenters’ efficiency – starting with their cooling systems. The team collected cooling-plant metrics and analyzed them in SkySpark, a building-analytics tool that they connected to NERSC’s OMNI data collection system, which combines HPC systems data with facility data and rack-level sensor data.
Using this medley of data sources, the team could continuously monitor and fine-tune its energy efficiency measures. One of the most important gains was being able to view the power usage effectiveness metric – typically calculated annually – in near-real time. “When you have that, you start being able to ask new kinds of questions about how to optimize the systems,” said John Elliott, chief sustainability officer at Berkeley Lab.
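To give a rough sense of what a “near-real-time” PUE calculation looks like, here is a minimal sketch in Python. The readings, column names, and 15-minute window are invented for illustration and do not reflect the actual OMNI or SkySpark data schema; the only thing taken from the article is the definition of PUE itself.

```python
# Illustrative sketch only: a rolling PUE computed from two hypothetical
# power time series (total facility draw and IT/compute draw).
import pandas as pd

def rolling_pue(facility_kw: pd.Series, it_kw: pd.Series,
                window: str = "15min") -> pd.Series:
    """PUE = total facility power / IT equipment power, averaged over a window."""
    df = pd.DataFrame({"facility_kw": facility_kw, "it_kw": it_kw}).dropna()
    avg = df.resample(window).mean()          # smooth out short spikes
    return avg["facility_kw"] / avg["it_kw"]  # ideal is 1.0; Shyh Wang Hall targets 1.1

# Fabricated one-minute readings, purely for demonstration:
idx = pd.date_range("2021-06-01", periods=60, freq="1min")
facility = pd.Series(1100.0, index=idx)   # kW, total building draw (made up)
it_load = pd.Series(1000.0, index=idx)    # kW, compute racks (made up)
print(rolling_pue(facility, it_load).head())
```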
While much of the project has focused on tweaking controls, other gains have come from small capital upgrades (such as installing a heat exchanger) or from more complex interplay between facilities. One interesting application involved establishing active communication between the internal fans on NERSC’s Cori supercomputer and the cooling plant, allowing the cooling load to shift between the system’s internal fans and the cooling towers depending on outdoor weather, as in the sketch below.
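The following is an illustrative toy sketch of that kind of weather-dependent decision, not the actual Cori or cooling-plant control logic: the wet-bulb threshold, setpoints, and names are all assumptions made purely to show the shape of such a rule.

```python
# Illustrative sketch only: a toy rule for shifting cooling effort between a
# system's internal fans and the cooling-tower loop based on outdoor conditions.
from dataclasses import dataclass

@dataclass
class CoolingSetpoints:
    fan_speed_pct: float        # internal fan speed command
    tower_supply_temp_c: float  # cooling-tower water supply setpoint

def choose_cooling_mode(outdoor_wet_bulb_c: float) -> CoolingSetpoints:
    """Lean on the cooling towers when outdoor air is cool and dry;
    lean on internal fans when the towers are less effective."""
    if outdoor_wet_bulb_c < 15.0:
        # Cool, dry weather: the towers can supply colder water,
        # so the compute system's internal fans can slow down.
        return CoolingSetpoints(fan_speed_pct=40.0, tower_supply_temp_c=18.0)
    # Hot or humid weather: tower supply water warms up, so the
    # internal fans take on more of the heat-removal work.
    return CoolingSetpoints(fan_speed_pct=70.0, tower_supply_temp_c=24.0)

print(choose_cooling_mode(12.0))
print(choose_cooling_mode(22.0))
```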
“Up to this point, facilities and compute systems in data centers have been highly siloed,” said Norm Bourassa, a building systems and energy engineer in NERSC’s Building Infrastructure Group. “But we’re moving into the exascale era where that boundary has to be broken. That’s been one of the most interesting things about this project: the opportunity to merge the compute systems and the facility plant silos into a unified holistic system and thus advance the state of computing science.”
Header image: Shyh Wang Hall. Image courtesy of Berkeley Lab.
To read more about the project, see the full article by NERSC’s Kathy Kincade.