Stanford HPC Center Rocks with Intel Cluster Ready Solution

By Steve Jones

March 28, 2008

In just eleven days during 2007, the Stanford University High-Performance Computing (HPC) Center nearly doubled the performance of its existing compute system. The center leveraged certification methodology from the Intel Cluster Ready Program to fully implement a 1,696-core cluster solution. The solution integrates Clustercorp, Dell and Panasas technologies to give the center the flexibility to meet ever-expanding computational and application requirements and to enable Stanford researchers to achieve faster time-to-results. Steve Jones, the founder and manager of the Stanford HPC Center, writes about the design and deployment of this system.

Mission: Enable CFD on Demand

The goal of our expansion was simple: acquire sufficient compute power to facilitate School of Engineering coursework and research efforts and to support the university’s industrial affiliates program.

The system would need to support more than 200 researchers and effectively enable CFD on demand. Two departments in particular require large-scale, massively parallel computing resources for their work. Researchers in the mechanical engineering and aeronautics and astronautics departments leverage HPC Center resources to analyze the details of flow and acoustics created by helicopters in forward flight. Critical applications include two major in-house-developed simulation codes: Stanford University multiblock (SUmb) and CDP, named for the late Charles David Pierce. Commercial applications include ANSYS, Gaussian, MATLAB, and VASP. Stanford’s post-processing programs include EnSight for distributed rendering and Tecplot for visualization.

Objective: Extend In-House Support for Complex Code Verification and Validation

To address Stanford researchers’ increasing need for code validation, our team opted to bolster local compute resources. The goal was to build a cluster-based solution capable of scaling to thousands of nodes and supporting our large user base. The minimum expectation for the cluster was to run routine jobs in-house and to allow researchers to do more extensive verification and validation of code destined for the national labs. In addition to the large compute capacity and scalability requirements, the system needed to be capable of sustained performance, with a file system fast enough to cope with massive I/O loads and to deliver the granularity of results researchers require.

Challenge: Overcome Limitations in Space, Time, and Staff

While our plan was to dramatically expand compute resources, the project was bound by constraints of space, time, and staff. The Stanford HPC Center has limited square footage in which to operate, so footprint was an issue, not only in terms of system size and density, but also in how it would impact general HVAC requirements. Extensive project demands also applied scheduling pressures that pushed the deployment team toward a far more aggressive rollout than the three- to six-month timeframe typically allotted for implementations of this scope. The final challenge was the center’s requirement to grow overall services delivery while maintaining the existing support staff.

We felt that taking advantage of an integrated solution would help us meet our compute and deployment objectives without exceeding our space, time, and staffing limitations. In particular, we were sold on the Intel Cluster Ready (ICR) program because certification of the Clustercorp/Dell/Panasas solution had been completed upfront. The certification validates cluster components to ensure accurate configuration, optimal performance, and a system that is easy to manage and to expand seamlessly.

Solution: Use Certified Best-of-Breed Compute, Distribution, and Storage Components

With the assistance of the Dell Advanced Systems Group and following extensive review of available technologies, we began to solidify plans. Specifically, the Dell team worked with us to design and build the final cluster-based solution architecture, which incorporates Dell PowerEdge servers with quad-core Intel Xeon processors and Intel compilers, Clustercorp Rocks+ cluster distribution software, and Panasas ActiveStor parallel storage. The design includes:

  • Dell PowerEdge 1950 servers. A standards-based design, 1U form factor, remote management and diagnostic functionality, scalability, and parts availability were important factors in the selection of Dell servers for the HPC Center compute nodes.
  • Cisco 7024 InfiniBand switch. High-speed, fully nonblocking internode communication helps ensure that compute nodes do not waste cycles waiting for messages (see the MPI sketch following this list).
  • Panasas ActiveStor AS3000 parallel storage (with object-based Panasas PanFS parallel file system). Determining factors included Intel Cluster Ready certification (Panasas ActiveStor Storage Clusters were actually the first parallel storage solution to be certified as part of the ICR program), MPI I/O optimization, and bandwidth per storage shelf.
  • Clustercorp Rocks+ 4.3 (with CentOS 4.5) and Clustercorp Intel Developer, Fluent, Moab, Cisco OFED, and Panasas Rolls. Rocks+ cluster distribution offers comprehensive application integration packages (i.e. Rocks+Rolls) that are essential for large-cluster configuration. Using Rolls to add software stacks with “checkbox” application distribution is a nice relief for system administrators who remember the old days of installation, distribution, configuration, scripting, and testing for every application added to the system. Rolls for Fluent, EnSight, and other higher-level applications greatly add to the simplicity of building an end-to-end solution.
  • Dell Deployment Services. The Dell Advanced Systems Group and the company’s ecosystem of partners, including APC, Cisco, Clustercorp, Intel, and Panasas, came together to architect and implement the solution seamlessly and in record time. Expertise and comprehensive planning were critical to the rapid deployment.
  • APC Hot Aisle Containment System. This chilled-water, row-cooling system enables greater rack density (and therefore smaller footprint) than would be possible using traditional room cooling.
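
The InfiniBand point above is easiest to see in code. Below is a minimal, illustrative MPI sketch of the pattern a fully nonblocking fabric is meant to keep off the critical path: post a halo exchange with non-blocking calls, compute on local data while the messages are in flight, and only then wait for completion. The ring exchange, buffer size, and dummy computation are hypothetical and are not taken from SUmb, CDP, or any of our production codes.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int N = 1 << 20;                       /* hypothetical halo size */
    double *sendbuf = malloc(N * sizeof(double));
    double *recvbuf = malloc(N * sizeof(double));
    for (int i = 0; i < N; i++) sendbuf[i] = (double)rank;

    int right = (rank + 1) % size;               /* simple ring exchange */
    int left  = (rank - 1 + size) % size;

    /* Post the exchange first, so the fabric can move data ... */
    MPI_Request reqs[2];
    MPI_Irecv(recvbuf, N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ... while this rank keeps computing on interior data. */
    double interior = 0.0;
    for (int i = 0; i < N; i++) interior += 0.5 * (double)i;

    /* Only now block until the halo exchange has completed. */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    if (rank == 0)
        printf("halo exchange done; interior work = %.1f\n", interior);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
```

Compiled with mpicc and launched across the compute nodes, each rank overlaps its message traffic with useful work instead of idling in a blocking receive.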

Regarding the storage component, let me explain further why we selected the Panasas parallel solution. Unlike NFS appliances and other clustered-storage products we considered, the Panasas storage system enhanced application performance to deliver the parallel I/O efficiency and stability our researchers require. Our experience with NFS has been that as we increase the number of processors or processes writing in parallel to the file system, we overwhelm the appliance and cannot successfully run a large simulation. That’s why we looked for a storage solution specifically optimized for parallel processing environments.
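
As a concrete illustration of the parallel I/O pattern described above, the minimal MPI-IO sketch below has every rank write its own disjoint slice of a single shared file with one collective call; this is the access pattern a parallel file system is designed to stripe across many storage nodes, and the one that tends to overwhelm a single NFS appliance. The mount point, file name, and block size are hypothetical, not taken from our production setup.

```c
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int COUNT = 1 << 20;                   /* doubles per rank (hypothetical) */
    double *block = malloc(COUNT * sizeof(double));
    for (int i = 0; i < COUNT; i++) block[i] = (double)rank;

    /* One shared file on the parallel file system; the path is hypothetical. */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "/panfs/scratch/snapshot.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank writes at its own disjoint offset; the collective call lets
       the MPI-IO layer aggregate requests and stripe them across storage. */
    MPI_Offset offset = (MPI_Offset)rank * COUNT * sizeof(double);
    MPI_File_write_at_all(fh, offset, block, COUNT, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(block);
    MPI_Finalize();
    return 0;
}
```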

We did run direct comparisons of the Panasas PanFS parallel file system against an array of other file systems, including those commonly targeted to HPC clusters. The PanFS parallel file system consistently outperformed them; in fact, when we ran our simulation and modeling applications, Panasas parallel storage allowed them to run significantly faster. We’ve used Panasas parallel storage in-house for more than three years now because we find it to be the highest-performing, most manageable, and most reliable of the leading storage solutions designed for high-performance computing.

Result: Achieve 2-14X Performance Improvement After an 11-Day Deployment

The entire deployment, including implementation of an entirely new power and cooling infrastructure, took a total of eleven days. The Dell Enterprise Deployment team played an integral role in this feat, coordinating the efforts of all participating vendors. The power, cooling, and system build-out were completed in parallel. We used the Rocks+ Linux cluster distribution to configure the master and compute nodes, and by day eleven our researchers were submitting jobs that executed flawlessly and produced scientific results with unprecedented fidelity. The new cluster easily handles ten times the workload of the original 48-node configuration. Testing shows performance of 15.8 teraflops, compared to the 1.1 teraflops delivered by the smaller cluster.

The Stanford HPC Center currently supports fifteen different types of HPC systems, including the Intel Cluster Ready system based on the Clustercorp/Dell/Panasas solution. The original 48-node system is now used by students and researchers working on smaller-scale problems. The ease-of-use of commodity-based hardware, efficient cluster distribution, and appliance-like ease of storage management allow the center to maintain a small support staff.

Research Impact: Achieve Faster and More Accurate Time-To-Results

The new cluster is paying off well for our center. It has delivered faster time-to-results by:

  • Allowing researchers to run more simulations and other heavy-compute jobs in-house.
  • Delivering finer detail or higher fidelity so that scientists more quickly and easily recognize salient features or other important phenomena.
  • Enabling more thorough verification and validation of complex codes destined to be submitted to the 10,000-65,000-processor systems at the National Labs.

With this new, more powerful, stable and manageable cluster in place, our researchers are better able to focus on their science and deliver consistent, meaningful results to project sponsors. Deployment of the new cluster has also brought added recognition to the Stanford High Performance Computing Center, helping us achieve a ranking of 130 on the Top 500 list in November 2007.

Meanwhile, our compute infrastructure continues to grow: the HPC Center is currently doubling the size of the existing cluster and designing an additional system based on a similar architecture. The integrated, standards-based solution and the Intel Cluster Ready certification combine to give the Stanford High Performance Computing Center the flexibility to add or change cluster elements to meet specific computational and application requirements. Stanford’s industrial partners are also able to take advantage of the system to enable more traditional operations and expand their own research and computing services.

About the Author

Steve Jones currently runs the High Performance Computing Center at Stanford University, supporting sponsored research for the Department of Energy's Advanced Simulation and Computing (ASC) program and the next-generation Predictive Science Academic Alliance Program (PSAAP). The HPC Center also supports the computational needs of sponsored research for the National Aeronautics and Space Administration (NASA), the Air Force Office of Scientific Research (AFOSR), and the Defense Advanced Research Projects Agency (DARPA). Jones chairs the annual Stanford High Performance Computing Conference, has designed and currently administers numerous Top500 supercomputers, and speaks regularly about the management of high-performance computing clusters. More information can be found at http://hpcc.stanford.edu and http://psaap.stanford.edu.
