Summit – the world’s fastest publicly ranked supercomputer – now has real-time streaming analytics. At the 2019 HPC User Forum at Argonne National Laboratory, Arno Kolster (principal and co-founder of HPC consultancy Providentia Worldwide) took the stage to explain how it happened – and what it means for the future.
The need for a smarter supercomputer
Summit launched at Oak Ridge National Laboratory (ORNL) in the second half of 2018. As of the June 2019 Top500 list, it still held the top spot among the world’s supercomputers, its 2.41 million cores delivering 148.6 Linpack petaflops. That also means a correspondingly massive power draw; Summit’s power consumption is rated at around 13 megawatts, equivalent to the energy draw of over 10,000 homes. That power produces an enormous amount of heat, requiring regular operation of power-hungry water chillers for Summit’s cooling system.
For ORNL, that makes reducing power consumption (and its cost) wherever possible a top priority. But a major obstacle remained: there was no mechanism in place that understood Summit’s second-to-second operations at a granular enough level to optimize them effectively.
This led Jim Rogers, director of computing and facilities at ORNL, to seek out Kolster. Rogers and Kolster, who knew each other from a partnership some six years earlier, reconnected at a conference in 2017, where Kolster was speaking about streaming analytics.
“At the end of the talk, Jim pulled me aside and said, ‘Hey, can you help us do streaming analytics on Summit? Because I’ve got a problem: I’ve got 4,600 nodes, all streaming data off them, and I have no idea what to do with them,’” Kolster recalled. “And I said, ‘Yeah, we can help you with that.’”
The goal? To have Summit’s immense data streaming directly off the nodes in real time, via a resilient system that could be scaled up without a proportional increase in staff.
Summiting a mountain of data
Kolster broke down the magnitude of the data at hand: first and foremost, there were 4,608 nodes, each with 99 metrics to capture per second – most importantly, power delivered to the fans, to the node itself and to individual components within it, such as CPU cores, GPU cores, DIMMs and HBMs. Outside the node, there was data from the job scheduler, polled every ten seconds; weather data from the National Oceanic and Atmospheric Administration (NOAA) once an hour; and continuous water flow data from the chillers.
All in all, about 460,000 metrics per second – with an eye toward expansion.
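The bulk of that figure comes from the per-node telemetry alone; a quick back-of-the-envelope tally (illustrative only, using the numbers quoted above) shows how it adds up:

```python
# Rough tally of the telemetry volume described above (illustrative only).
# The job scheduler, NOAA weather feed and chiller data are polled far less
# frequently and add comparatively little to the per-second total.
nodes = 4_608
metrics_per_node_per_second = 99

node_metrics_per_second = nodes * metrics_per_node_per_second
print(f"{node_metrics_per_second:,} node metrics per second")  # 456,192, i.e. "about 460,000"
```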
This didn’t worry Providentia – the founders had handled upwards of seven million events when they were at PayPal – and so, about a year ago, Providentia set about the long process of building streaming analytics for Summit. “The majority of the time was spent on [addressing] the legal hurdles for small business to work with the government. That took probably a couple of months,” Kolster said in an interview with HPCwire. “The other long-term time was spent on crafting the statement of work in a way that we were both happy with it.” Providentia also had to navigate around Summit’s tight security and scheduling. In the end, Kolster said, it was about three months of development over an eight-month period – all done remotely on three small nodes.
Providentia began with a Kafka-based event message bus linked into the data sources. It added data persistence tools: Prometheus as a time series database and Elasticsearch for log metrics and understanding, among others. Docker was used to containerize and scale the services, and Spark Streaming was added for on-the-wire data analytics. Finally, Grafana and Seaborn came into play for data visualization. (“Young people like to play with this stuff,” Kolster said of the long list of technologies, “so it’s a way of getting some of the younger people involved with HPC.”)
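The article doesn’t include Providentia’s actual code, but a per-node collector feeding a Kafka bus of this kind might look roughly like the sketch below (the kafka-python client, broker address, topic name and metric fields are assumptions for illustration, not details from the project):

```python
import json
import time

from kafka import KafkaProducer  # assumes the kafka-python package is installed

# Hypothetical collector: publishes one JSON record per node per second onto a
# Kafka topic, from which downstream consumers (Spark Streaming jobs,
# Prometheus / Elasticsearch ingesters) could read.
producer = KafkaProducer(
    bootstrap_servers="kafka.example.org:9092",  # placeholder broker address
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def publish_node_sample(node_id: str, metrics: dict) -> None:
    record = {"node": node_id, "timestamp": time.time(), **metrics}
    producer.send("summit-node-metrics", value=record)  # hypothetical topic name

# Example: one second's worth of (made-up) readings for a single node.
publish_node_sample("node-0001", {"node_power_w": 2150.0, "gpu0_power_w": 285.3})
producer.flush()
```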
And the result? “It’s pretty spectacular,” Kolster said. Near-instant, agnostic data that could be custom-formatted; overlapping metrics with real-time visualizations. Kolster pulled up an example of one of the visualizations, thousands of glittering green cells fading in and out as Summit’s power-per-job fluctuated across its nodes.
A new paradigm
“It’s a new paradigm,” Kolster said. “There’s no more looking at databases for data. There’s no more waiting until tomorrow to look at the data. It’s basically real-time data. What you see right now is what’s happening right now.”
“The largest supercomputer in the world is now being micromanaged by microservices – a cloud thing,” he continued.
For now, the infrastructure’s capabilities center on instant analytics and visualization, which help system operators manually adjust Summit’s cooling and optimize job scheduling. Of course, it already has a couple of neat tricks up its sleeve – notably, the ability to alert operators if the temperature in a specific area rises by a certain amount. Kolster also hopes that the new infrastructure will help clients ask (and answer) crucial questions, such as “why is my job spending more time on the CPU than on the GPU?” or “why does my job consume more power than someone else’s job?”
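The alerting rule itself wasn’t described, but in spirit it reduces to watching a short window of readings per zone and flagging a rise above some threshold; a minimal sketch along those lines (with a hypothetical window size and threshold, and a plain Python class standing in for whatever ORNL actually runs) might look like this:

```python
from collections import deque

# Hypothetical rule: alert if a zone's temperature rises by more than
# RISE_THRESHOLD_C across the last WINDOW_SECONDS of readings.
WINDOW_SECONDS = 300
RISE_THRESHOLD_C = 3.0

class ZoneTemperatureMonitor:
    def __init__(self):
        self.readings = deque()  # (timestamp, temperature_c) pairs

    def observe(self, ts: float, temp_c: float) -> bool:
        """Record a reading; return True if an alert should be raised."""
        self.readings.append((ts, temp_c))
        # Drop readings that have fallen out of the window.
        while self.readings and ts - self.readings[0][0] > WINDOW_SECONDS:
            self.readings.popleft()
        oldest_temp = self.readings[0][1]
        return temp_c - oldest_temp > RISE_THRESHOLD_C

monitor = ZoneTemperatureMonitor()
monitor.observe(ts=0.0, temp_c=24.0)
if monitor.observe(ts=120.0, temp_c=27.5):
    print("temperature rise alert")  # 3.5 C rise within the window
```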
Still, Kolster seems to have his heart truly set on “phase two” of the project, which (for now) remains a speculative endeavor. Phase two would involve leveraging the massive data stream for robust predictive analytics that would, for example, allow Summit to automatically schedule jobs to cooler areas of the cluster. “You could actually have the job scheduler be smart enough to schedule jobs according to their power consumption, based on historical metrics,” Kolster said. “And that’s very powerful, because that’s something that can be done right now.”
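Since phase two remains speculative, any code here is necessarily hypothetical; but the heuristic Kolster sketches could start out as simply as ranking candidate nodes by their recent thermal (or power) history before placing a job:

```python
# Speculative sketch of the "phase two" idea: prefer cooler nodes when placing
# a job, based on historical telemetry. Node names and temperatures are made up.
def pick_coolest_nodes(candidates, recent_avg_temp_c, nodes_needed):
    """Return the `nodes_needed` coolest nodes among `candidates`."""
    ranked = sorted(candidates, key=lambda n: recent_avg_temp_c.get(n, float("inf")))
    return ranked[:nodes_needed]

history = {"node-0001": 31.2, "node-0002": 28.7, "node-0003": 29.9}
print(pick_coolest_nodes(["node-0001", "node-0002", "node-0003"], history, nodes_needed=2))
# -> ['node-0002', 'node-0003']
```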
“That’s basically where things are heading,” he continued. “You hear about predictive analytics, prescriptive analytics – the basic problem right now is that everyone’s reacting to things instead of being proactive about it. And so we’ve always been more, you know, ‘You’ve got the machinery, you’ve got the computers, you have the analytics – let’s be more proactive about how things are working.’”
Whether or not Providentia is invited back for phase two of Summit’s analytics infrastructure, Kolster is happy with the results. “We’d love to finish off the second phase of the Oak Ridge project, because we have some really interesting things around AI and machine learning that we think we can bring to bear there,” he said. “But we know full well that they’ve also got some really smart people there that might want to delve into those areas on their own. We’ve built the ‘highway,’ if you will, for them to move cars and trucks around, and now they can do whatever they want with the on-ramps and off-ramps.”
Musing on future applications of Providentia’s approach to Summit, Kolster said that he would prefer to showcase the “full vision” rather than arriving in medias res. “I would rather do it up front and be part of the stack that goes in instead of doing it afterwards and retrofitting it in,” he said. He mentioned that Providentia is talking to two different verticals where the model can be used – and, of course, he said that they would love to work with Frontier, which is expected to be the world’s most powerful system when it launches in 2021.
“It’s not because it’s a new thing,” Kolster said of organizations’ interest in this approach. “It’s just that people … don’t understand – moving messages around by the millions of messages a second, they don’t understand that this can be accomplished. … And then it opens up a whole new discussion as to possibilities they never knew existed.”