Los Alamos Pursues Efficient Computing with Cray, Marvell and Arm

LOS ALAMOS, N.M., Nov. 7, 2018—In a drive to significantly boost usable operations per watt, per dollar and per development hour for extreme-scale computing, Los Alamos National Laboratory is running classified simulation codes in support of the National Nuclear Security Administration’s Stockpile Stewardship Program on the new Cray XC50 system with Marvell ThunderX2 processors.

The collaboration with Cray Inc., funded by the NNSA’s Advanced Simulation and Computing Program, integrates the Marvell ThunderX2 processors with Cray’s proven networking and software ecosystem in the Laboratory’s secure computing environment.

“For too long, the community has been driving for peak operations per watt, while mission-critical, national security applications have extracted fewer usable operations per peak FLOPS (floating-point operations per second) even after spending enormous time and energy revamping applications to mate to machines on this peak FLOPS quest,” said Gary Grider, leader of the High Performance Computing Division at Los Alamos. “Our focus is on fostering efforts and systems that enable efficient mission-focused computing at extreme scale. The simulations run at Los Alamos use highly irregular data structures that require high-fidelity, multi-physics applications that utilize peta-scale datasets and workflows for the security of the nation.”

The new Marvell ThunderX2 processors, based on the Armv8-A architecture, are designed to perform calculations more efficiently and with greater memory bandwidth. A workload-optimized processor enables more efficient calculations per watt as well as more concurrent work in the processor. The improved memory bandwidth is crucial to the Los Alamos mission workload of extreme-scale physics simulations. Los Alamos and Marvell are collaborating on future proposed enhancements to the overall processor and memory subsystem that will build on existing processor capabilities.

Los Alamos is working with Cray to ensure that applications can be easily ported to this new architecture and achieve a higher percentage of peak performance than on past systems. Additionally, the Laboratory and Cray will hone the necessary software, including compilers, operating systems, networking and storage, and orchestration services, to ensure a quick transition to full secure production for this architecture.

“As always, we’re pleased to partner with the Los Alamos team in support of their important mission to solve national security challenges through scientific excellence,” said Fred Kohout, senior vice president and CMO at Cray. “Being able to provide the performance and scalability of the Cray XC50, our end-to-end software environment and Cray’s compilers and tools that include unique optimizations for Arm processors offers the Laboratory the ability to quickly and easily solve production level challenges and ensure their scientists get the results they need faster in a robust, operational supercomputer.”

Cray’s deployment of the Marvell Arm-based technology leverages Cray’s long experience in providing working software environments for large-scale supercomputing. Although other sites use Arm systems to develop codes or investigate elements of the HPC software stack, Los Alamos is the first to deploy and use the Marvell Arm-based processors in direct support of classified high performance computing for national security mission work. The encouraging results of this deployment helped inspire the launch of the Efficient Mission-Centric Computing Consortium (EMC3) with HPC technology providers and HPC consumers to support and accelerate the move toward more efficient computing and environments for extreme-scale, mission-centric computing.

“The deployment of ThunderX2 processors at Los Alamos National Laboratory to run simulation codes for extreme-scale national-security mission work is a significant milestone for Arm-based server technology,” said Gopal Hegde, vice president and general manager, Server Processor Business Unit at Marvell Semiconductor, Inc. “Los Alamos will benefit from the ThunderX2 processor’s key design and optimization principles. These design features will have direct impact on the efficiency and performance of some of the most challenging applications and use cases in high performance computing.”

High performance computers play a pivotal role in Los Alamos’ mission of maintaining the nation’s nuclear stockpile and understanding the complicated physics of nuclear deterrence. Beyond Los Alamos, industries such as energy and film will also benefit from the more efficient technologies being fostered. The oil industry uses supercomputers to simulate underground reservoirs to guide investments of hundreds of millions of dollars in developing oil fields and processing their output. The film industry relies heavily on HPC systems to create and render detailed animations in blockbuster movies.

About Cray Inc.

Cray Inc. combines computation and creativity so visionaries can keep asking questions that challenge the limits of possibility. Drawing on more than 45 years of experience, Cray develops the world’s most advanced supercomputers, pushing the boundaries of performance, efficiency and scalability. Cray continues to innovate today at the convergence of data and discovery, offering a comprehensive portfolio of supercomputers, high performance storage, data analytics and artificial intelligence solutions.

About Marvell

Marvell first revolutionized the digital storage industry by moving information at speeds never thought possible. Today, that same breakthrough innovation remains at the heart of the company’s storage, processing, networking, security and connectivity solutions. With leading intellectual property and deep system-level knowledge, Marvell’s semiconductor solutions continue to transform the enterprise, cloud, automotive, industrial, and consumer markets.

Source: LANL