Dell to Build 10-Petaflop Supercomputer For Science

By Michael Feldman

September 22, 2011

The Texas Advanced Computing Center (TACC) has revealed plans to deploy a cutting-edge petascale supercomputer courtesy of a $27.5 million NSF award. Built by Dell, the system will consist of 2 petaflops of Sandy Bridge-EP processors with an 8 petaflop boost from Intel’s Many Integrated Core (MIC) coprocessors. The machine is scheduled to boot up in late 2012 and be ready for production in January 2013.

Not only is this Dell’s first petascale system — at least the first one announced publicly — it will likely be the first deployment of Intel’s commercial MIC technology. In this case, the chips in question are pre-production versions of Knights Corner, the first commercial part in that product line. These early chips will be identical to the future production parts.

Stampede, as the system will be called, is meant to serve both traditional number-crunching HPC applications and data-driven analytics applications within NSF’s eXtreme Digital (XD) user community. XD includes the Extreme Science and Engineering Discovery Environment (XSEDE) project, the successor to TeraGrid that encompasses more than a dozen universities and two research labs. At 10 petaflops, Stampede will be the most powerful resource for XD users.

According to Jay Boisseau, TACC Director and PI of the Stampede project, the system is expected to have several hundred projects running on it from day one. “We want to bring in users with big data sets that are doing large-scale analyses, as well as the simulations types of users,” he told HPCwire.

Data-intensive science applications include traditional ones like bioinformatics, but also codes from geosciences and astronomy — application domains that are already accumulating large amounts of digital data. Boisseau thinks as much as half of Stampede’s resources will be devoted to these types of applications.

The data-intensive support will bring in a new set of users, many of whom are not as HPC savvy as the traditional simulation folks. For this group, Boisseau is planning to develop a much richer software environment, including new application portals and gateways, building on work begun under the TeraGrid project. The team will also look to bring in experts in statistics, data mining, data management, and so on, in order to support the data-driven application domain.

Some of the expertise and software resources are already built into the project via university collaborations. Besides The University of Texas at Austin, partner schools include Clemson University, University of Colorado at Boulder, Cornell University, Indiana University, Ohio State University, and The University of Texas at El Paso.

Hardware-wise, the foundation of Stampede is a 2 petaflop cluster with 6,400 x86 compute nodes, lashed together with FDR (56 Gbps) InfiniBand from Mellanox. Each node will house two of Intel’s 8-core Xeon E5 (aka Sandy Bridge-EP) and 32 GB of DRAM.
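As a rough sanity check on those figures, the sketch below works out the core count and the clock speed that a 2 petaflop peak would imply. The node, socket, and core counts come from the article; the figure of 8 double precision flops per core per cycle is an assumption based on Sandy Bridge's AVX units (one 4-wide add and one 4-wide multiply per cycle), and the resulting clock is an inference, not a number TACC or Intel has stated.

```python
# Back-of-the-envelope peak estimate for the base x86 cluster.
NODES = 6400              # from the article
SOCKETS_PER_NODE = 2      # from the article
CORES_PER_SOCKET = 8      # from the article
DP_FLOPS_PER_CYCLE = 8    # ASSUMPTION: AVX add + multiply, 4 doubles each


def total_cores() -> int:
    """Number of x86 cores across all compute nodes."""
    return NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET


def flops_per_cycle() -> int:
    """Double precision flops the whole cluster can retire per clock cycle."""
    return total_cores() * DP_FLOPS_PER_CYCLE


def implied_clock_ghz(peak_pflops: float) -> float:
    """Clock speed (GHz) needed to reach the given peak, under the assumption above."""
    return peak_pflops * 1e15 / flops_per_cycle() / 1e9


if __name__ == "__main__":
    print(f"cores: {total_cores():,}")
    print(f"implied clock for 2 PF: {implied_clock_ghz(2.0):.2f} GHz")
```

Under these assumptions the cluster has 102,400 cores, and a 2 petaflop peak works out to a clock in the neighborhood of 2.4 GHz.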

Stampede will also include 16 big memory nodes, each sporting 1 terabyte of DRAM and 2 NVIDIA GPUs. Memory-wise, that’s not exactly in SGI Altix UV territory, but it’s a respectable capacity for extra-large SMP applications. Boisseau says they’re also considering ScaleMP’s virtual SMP solution to construct a shared memory environment across all 16 TB. The shared memory sub-cluster is slated to be used for some of the big data analytics applications that Stampede will host.

The cluster will also be hooked up to Lustre storage nodes, also supplied by Dell. The storage will consist of 14 PB of disk and deliver an aggregate bandwidth of 150 GB/second. “Over the lifetime of the project we’re expecting that to grow substantially both in capacity and bandwidth over the lifetime of the system,” said Boisseau.

The Dell system was developed by its Data Center Solutions division, under the code-name Zeus. Although the technology will debut in Stampede, the company is expecting to make the Zeus product generally available for “hyperscale” supercomputing in 2012.

Stampede’s base cluster and storage nodes represent the lion’s share of the NSF funding at $25 million. The remaining $2.5 million will go toward 8 petaflops worth of MIC coprocessors, which will be hooked into the x86 nodes via PCIe 3.0 links. MIC is Intel’s x86-based manycore HPC architecture aimed at highly parallel codes, and competes head-on with NVIDIA’s Tesla and AMD’s FireStream GPUs.

GPGPU enthusiasts were not completely slighted though. Besides the GPUs in the shared memory nodes, 128 of the 6,400 regular nodes will be outfitted with NVIDIA’s next-generation Kepler GPUs to support remote visualization. Kepler is the successor to Fermi, NVIDIA’s current GPU architecture. Tesla implementations of Kepler aimed at HPC servers should begin shipping sometime in 2012.

Intel has not announced an official launch date for the Knights Corner MIC product, but it should be generally available sometime in 2013, or perhaps late 2012 if Intel’s 22nm process technology ramps up more quickly. The actual number of MICs in Stampede is not public, but Intel has promised TACC enough of them to deliver 8 peak petaflops.

Using a little quick math, each MIC chip will probably need to deliver at least 1.3 to 1.5 double precision teraflops to hit the 8 petaflop performance target. Coincidentally, NVIDIA’s Kepler GPU is also expected to deliver about 1.3 to 1.5 double precision teraflops. Note that the first MIC parts will be implemented with Intel’s Tri-Gate 22nm technology, while the Kepler GPUs will be manufactured on standard 28nm technology.
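The quick math can be spelled out in a few lines. The sketch below is illustrative only: the 8 petaflop target and the 6,400-node count come from the article, but the actual number of MIC chips is not public, so the one-chip-per-node case is a hypothetical scenario rather than a known configuration.

```python
# Per-chip performance implied by the 8 petaflop MIC target.
TARGET_PFLOPS = 8.0     # MIC contribution, from the article
COMPUTE_NODES = 6400    # x86 node count, from the article


def chips_needed(per_chip_tflops: float) -> float:
    """Chips required to reach the target at a given per-chip peak (teraflops)."""
    return TARGET_PFLOPS * 1000.0 / per_chip_tflops


def per_chip_tflops(chips: int) -> float:
    """Per-chip peak (teraflops) required if the target is split across `chips`."""
    return TARGET_PFLOPS * 1000.0 / chips


if __name__ == "__main__":
    for tf in (1.3, 1.5):
        print(f"{tf} TF per chip -> about {chips_needed(tf):,.0f} chips")
    # HYPOTHETICAL: exactly one coprocessor in every compute node.
    print(f"one chip per node -> {per_chip_tflops(COMPUTE_NODES):.2f} TF per chip")
```

At 1.3 to 1.5 teraflops per chip, the target works out to somewhere between roughly 5,300 and 6,200 coprocessors, which is in the same range as the number of compute nodes.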

At this point, Boisseau is expecting to receive all the Intel MIC coprocessors sometime in the fall of 2012, possibly in time for a Linpack run for the November TOP500 list. By that time, all the Sandy Bridge compute nodes should be fully deployed. If all goes according to plan, early access users should be able to start running codes on the machine by December 2012.

Although MIC will support a number of parallel computing models, the most straightforward one is OpenMP. This will be especially advantageous for users with hybrid MPI-OpenMP codes. The idea would be to just offload the OpenMP chunks to the coprocessors in order to parallelize those loops. Users with straight MPI codes will need to do more work to tap into MIC acceleration.

There is already an upgraded version of Stampede on the drawing board. About two years into the project, TACC is planning to deploy second-generation MIC coprocessors in another (smaller) batch of chips. The goal is to add 5 more petaflops to the system, bringing its grand total to 15 peak petaflops sometime around the middle of the decade.

The NSF will be funding Stampede for at least four years. Besides the initial $27.5 million outlay to build and install the system, an additional $24 million or so for system operation and support is expected to be on the table soon, bringing the total Stampede investment to more than $50 million. The project also includes an option for renewal in 2017, which would result in the deployment of an even larger and more powerful machine toward the end of the decade.
