“Before, you would have to make decisions on the next day based on the previous day’s data. But now we’re getting this stuff in real-time streaming and we’re able to make decisions based on the last 5 minutes of data,” Ceballos said.

Scott Simmerman, an HPC software engineer in the User Assistance and Outreach group who is currently integrating the available Kafka data feeds into RATS Report, said the shift to Kafka will also help gather job data from special sources, such as the Compute and Data Environment for Science (CADES), an integrated computing infrastructure.

“Currently the process for getting job logs from CADES has lots of moving parts and can be a pain to set up and maintain,” Simmerman said. “With Kafka, it’s just a matter of them publishing their job logs to Kafka and then us consuming them—with no files moving around different file systems.”

Once more data sources are connected to the Kafka platform, the CAM team plans to focus on new ways to process and visualize the data, making it easily accessible to NCCS personnel. From there, intelligent applications can be built to make automated decisions based on this data, improving operational efficiency.

“The first phase of any intelligent system is gathering data, so once we gather all this data, we can start putting it into applications that make decisions based on that data over time,” Prout said. “That’s the ultimate goal: to make it an intelligent facility.”

A recurring Kafka User Group meeting will be held on the second Tuesday of every month starting January 14, 2020, from 2 to 2:30 p.m. in Building 5600, Room E202. OLCF personnel can email [email protected].ornl.gov to receive a calendar invite, though an invite isn’t required to attend.

About Oak Ridge National Laboratory

UT-Battelle LLC manages Oak Ridge National Laboratory for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.


Source: Coury Turczyn, Oak Ridge Leadership Computing Facility (OLCF)