First Details Emerge from Cray on Trinity Supercomputer

By Nicole Hemsoth

July 10, 2014

Note – 7:32 p.m. Eastern: We have full details from Los Alamos about the system in a detailed update article.

Cray has been granted one of the largest awards in its history for the long-awaited “Trinity” supercomputer. This morning the company announced a $174 million deal to provide the National Nuclear Security Administration (NNSA) with a multi-petaflop next-generation Cray XC machine, complemented by an 82 petabyte capacity Cray Sonexion storage system. The new supercomputer is intended to support stewardship of the agency’s nuclear stockpile, simulating everything from continued maintenance to degradation and even destruction of the vast reserves, as well as hosting a wealth of classified national security applications.

The original proposal for the system suggested a need for a machine capable of up to 30 petaflops, and that figure might not be unrealistic given what we know about the architectural choices and the amount Cray inked into its revenue for the year today, which caused a decent uptick in its stock price and tipped the company into billion-dollar valuation territory.

The system will be powered by what sounds like a relatively balanced combination of future-generation Haswells (we’re guessing between 14 and 18 cores) and future Knights Landing processors (60+ cores), a strategy driven by a clear sense of NNSA application and simulation goals. We’ll do some speculative math in the coming week or so about what this system might actually look like, since no figures have been released, FLOPS or otherwise. Given so many unknowns in the final core counts of Haswell and Knights Landing, as well as pricing, we want to take our time on those guesses. But the early look we got from the formal announcement and our conversation with Cray suggests this is going to be a core-heavy, FLOPS-centric powerhouse, even if it doesn’t meet the high-end 30 petaflop target.

Following a conversation this morning with Cray’s Barry Bolding, we learned there are two major phases in the deployment leading up to acceptance testing late next year or into the following year, a schedule likely determined by Intel’s delivery of the new Xeons and Knights Landing chips rather than by any delays on Cray’s part. What’s interesting is that the system sounds evenly balanced between the two processor types.

Bolding says that processor updates, rather than an entirely new system driven by custom engineering of the interconnect, cooling, or other components, are the defining factor in the next generation of their XC. He does note that in the new generation of XC machines there has been extensive work done to support the large number of new Intel cores within the software stack, and ostensibly in their Sonexion storage to support the tiered storage demands of using burst buffers in novel ways. The idea, he says, is to make machines that are ready to roll into large deployments like this and the NERSC system instead of custom engineering systems based on particular user requirements.

“It’s hard to build these reproducible products at this scale that multiple sites agree they can all use. Our philosophy is to create these massive production systems and it’s good that we don’t have to custom design each one. There’s going to be an evolution of the software stack and new features we’re not talking about today, but there will be innovations—and that’s another reason it’s a multi-phased approach.”

“Each phase is significant in size—the first will predominantly be the next-generation Haswell processors, followed by the Knights Landing piece in a later phase.” Both are major parts of the installation; neither is much larger than the other.

Other systems that are set to come online in the next year and a half may be reliant on more novel, diverse architectures, but with a very specific, known set of users and projects, it’s clear that the NNSA had a direct sense of how the additional cores (and presumably the on-package memory of Knights Landing) would translate directly into meaningful results.

The system choice was driven by the need to secure a mixed-workload system, hence the processor choice of both Knights Landing and Haswell cores (compared to the NERSC “Cori” supercomputer that Cray is building, which is predominantly based on next-generation Knights Landing). “The binary compatibility in their Xeon line is a Knight’s innovation when you want to do heterogeneous types of problems across different types of processors. It’s not super-unique, but it’s interesting that they want to do this at such large scale,” said Bolding.

“NERSC’s system and Trinity are both XC systems, but these are different workloads with different mandates. NERSC has a broad user base as an open science DoE system serving thousands of applications and hundreds of users. The Trinity system will be used for more targeted weapons stockpile-related workloads.” He says it shows that the XC systems can be diverse enough to support both distinct user types and beyond.

Aside from the sheer core thrust from the Intel processors, one of the more interesting elements of the upcoming machine is the storage. Bolding says they wanted a very large, powerful Lustre environment and Sonexion met those requirements. We’ll be bringing more details on the burst buffer and general storage component later today, following a conversation with one of the leads on that front at Los Alamos, but for now we have some initial details from Cray.

“Tiered storage (and burst buffers are a particular tier) will be more important for customers like this in the future but there is real interest in more than just Lustre at other tiers. We are working to develop this in multiple tiers to support these needs,” said Bolding.

This is among the largest deals in Cray’s history. The company had a multi-year DARPA contract valued initially at $250 million in 2006, although the final contract was closer to the amount of the Trinity system. The Blue Waters procurement, as tangled as it might have been in 2011, was around $200 million, and deals of similar dollar value were secured at Oak Ridge. Still, this represents one of the top contracts for Cray, and we’re just getting into the swing of procurement news, which will pick up now that there is clarity around when the latest Intel processors will roll out, something that undoubtedly is driving procurement timelines across the board.

The new supercomputer will be housed at Los Alamos National Laboratory and is part of a joint effort between the New Mexico Alliance for Computing at Extreme Scale (ACES), based at LANL, and Sandia National Laboratories’ NNSA Advanced Simulation and Computing Program (ASC).

HPCwire