Berkeley Lab, Intel to Collaborate in Updating Scientific Codes for Manycore Architectures

June 18 — Lawrence Berkeley National Laboratory has been named an Intel Parallel Computing Center (IPCC), a collaboration with Intel aimed at adapting existing scientific applications to run on future supercomputers built with manycore processors. Such supercomputers will potentially have millions of processor cores, but today’s applications aren’t designed to take advantage of this architecture.

Most scientific applications, such as those used to study climate change, combustion, astrophysics and materials, are designed to run on parallel systems, meaning that the problem is divided into smaller tasks so that more of the calculations can be done simultaneously, reducing scientists’ time to solution. With the growing use of manycore processors, such as Intel’s Xeon and Xeon Phi processors, which can have more than 60 cores in each processor, applications will need to have even more parallelism. Unless applications are modernized, they will not be able to take advantage of the greater computing performance promised by manycore processors.

The Berkeley Lab IPCC will be led by Nick Wright of the National Energy Research Scientific Computing Center (NERSC), and Bert de Jong and Hans Johansen of the Computational Research Division (CRD).

“Although manycore processors will significantly increase supercomputing performance, that’s only part of the equation,” said Wright, who leads NERSC’s Advanced Technologies Group. “To fully capitalize on this capability, we need to modernize the applications our user community uses to advance scientific discovery. Intel Parallel Computing Centers such as ours are helping to support the community to attack this problem.”

Optimizing applications for manycore is important for NERSC, which announced in April that its next-generation supercomputer will be a Cray XC supercomputer using Intel’s next-generation Xeon Phi processor, which will have more than 60 cores. NERSC is working with its 5,000 users to help them adapt their codes to the new system, which is expected to be delivered in 2016.

The Berkeley Lab IPCC will focus on increasing the parallelism of two widely used applications: NWChem and CAM5, the Community Atmosphere Model. NWChem is a leading application for computational chemistry, and CAM5, part of the Community Earth System Model, is widely used for studying global climate. Modernizing these codes to run on manycore architectures will enable the scientific community to pursue new frontiers in the fields of chemistry, materials and climate research. Because both NWChem and CAM5 are open source applications, any improvements made to them will be shared with the broader user community, maximizing the benefits of the project.

“Enabling NWChem to harness the full power of manycore processors allows our computational chemistry and materials community to accelerate scientific discovery, tackling more complex scientific problems and reducing the time researchers have to wait for simulations to complete,” says de Jong, who leads CRD’s Scientific Computing Group and is a lead developer of the NWChem software. “Advances made by our IPCC will be shared with the developer community, including lessons learned and making our code available as open source.”

The goal is to deliver enhanced versions of NWChem and CAM5 that at least double their overall performance on today’s manycore machines. The research and development will focus on implementing greater amounts of parallelism in the codes, starting with simple modifications such as adding new components or modifying existing ones, and going as far as exploring new algorithmic approaches that can better exploit manycore architectures.

“The open-source scientific community truly depends on CAM components running effectively at NERSC. And climate scientists have always been early adopters of cutting-edge architectures,” says Johansen, a computational science researcher at Berkeley Lab. “With more performance and more parallelism, scientists can accelerate their simulations and more accurately represent atmospheric dynamics. This collaboration with Intel will help climate science developers leverage NERSC’s and Intel’s network of resources and manycore expertise.”

Berkeley Lab is an ideal collaborator for this project. The lab is home to NERSC, the U.S. Department of Energy’s most scientifically productive supercomputing center with more than 5,000 users running about 700 different applications. CRD is home to fundamental research programs in computer science, applied mathematics, and computational science where researchers investigate future directions in scientific computing and work to develop new tools and technologies to fully exploit the increasing power of supercomputers.

According to Wright, NERSC staff will conduct extensive outreach and training to share what they have learned with NERSC’s broader user community. This will supplement the training and outreach NERSC already provides to support users of its current flagship supercomputer “Edison,” a Cray XC30 supercomputer that uses Intel Xeon “Ivy Bridge” processors. Additionally, the work will be part of NERSC’s Application Readiness program to help prepare users for the expected 2016 delivery of “Cori,” a Cray XC supercomputer architected with Intel’s next-generation Xeon Phi processor (named “Knights Landing”), which will have more than 60 cores per processor.

Berkeley Lab is the first Department of Energy laboratory to be named an IPCC. Other IPCCs are located at leading universities and research institutions around the world.

About Berkeley Lab Computing Sciences

The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy’s research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe. ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 5,500 scientists at national laboratories and universities, including those at Berkeley Lab’s Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation.

—

Source: Lawrence Berkeley National Laboratory