May 18, 2007
The focus of this year's Linux Cluster Institute (LCI) Conference was big clusters. The emphasis was not on raw performance per se, but on every other factor required to acquire, host, provision, and maintain large systems and to achieve scalable performance from them as a whole.
The first two keynotes set the tone by describing the perils and pitfalls of installing huge systems and getting them to perform. Even after a few years, all of the pieces don't necessarily play together well enough to meet the original design objectives. Horst Simon began the first day with an excellent philosophical discussion about the current state of high performance computing (HPC), hardware architecture, and the political atmosphere surrounding the drive to assemble the world's first petaflop machines. He noted that even though construction of a petaflop computer has started, there are presently only two general-purpose machines in the world capable of 100+ teraflops on the Linpack benchmark.
This was a perfect segue from the opening keynote Monday evening by Robert Ballance of Sandia National Laboratories (SNL) about the difficulties of assembling Red Storm and getting it to perform. Even though Sandia has years of experience building and maintaining some of the largest supercomputers in the world, Red Storm turned out to be a unique experience for them. Why? Because it was much bigger than anything they had previously built. So the old saw in computing, "if it's 10x bigger, it is something entirely new," still holds, and we should not expect a petaflop machine to come together quietly at this moment in HPC time.
One interesting observation that Horst made in his talk is that programming a 100,000+ core machine using MPI is akin to programming each transistor individually by hand on the old Motorola 68000 processor, which of course had only 68,000 transistors. That wasn't so long ago for most of us, and his point is that we can't grow much more in complexity unless we have some new software methodology for dealing with large systems.
The discussions generated by his comments never explicitly addressed the fact that we are going to need new compiler technology sooner rather than later to handle the complexity. Neither MPI nor OpenMP is the answer by itself.
The rest of the talks on day one had a heavy emphasis on parallel I/O systems, and the difficulties of getting them to scale on large cluster systems. The problem here is that some of the tests can take so long (Laros, SNL) that the production system would be unavailable for unacceptably long periods of time. So I/O system administrators are forced to do simulations of the I/O systems on smaller development configurations. Presently, it seems that scalable I/O systems are limited to about one KiloClient (my term) for single-process/single-file I/O scenarios. Forget about it if you're talking about shared-file I/O. I think this is still pretty darn good progress, but the performance variability of these I/O systems is large, and it appears that their performance is very sensitive to a huge number of environmental parameters. Repeatability seems to be somewhere over the HPC horizon.
One more issue pertaining to large I/O systems: "operability" is not a synonym for "capability."
An interesting talk by Andrew Uselton and Brian Behlendorf from Lawrence Livermore National Laboratory discussed the difficulties they had with the I/O system delivered with Blue Gene/L. They "sweated bullets" (their term, not mine) for six months trying to get the I/O system to perform up to design specs. Internally, they referred to it as "the death march." The system, as delivered, "worked." However, the severely oversubscribed network design left them with an initial performance deficit of 50 percent against the target of 30+ GB/sec. This seems to be akin to spending two hundred grand on a Ferrari and discovering that it won't get you to the market faster than your neighbor's Buick without considerable tuning. Not that I'm blaming IBM. This talk could have addressed systems from every other manufacturer. There was no sensible way to build the I/O system without oversubscription at that time. It just points out that these complex systems that push the state of the art do not come out of the box ready for prime time.
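To put rough numbers on the oversubscription point, here is a minimal back-of-the-envelope sketch in Python. The 2:1 ratio and the function name are my own assumptions, chosen only so that the arithmetic reproduces a 50 percent shortfall against a 30 GB/sec target. They are not figures from the Livermore talk, and the real network was certainly more complicated than a single ratio.

```python
def deliverable_bandwidth(demand_gb_s: float, oversubscription: float) -> float:
    """Upper bound on aggregate throughput through one oversubscribed network stage.

    `oversubscription` is the ratio of edge (client-facing) capacity to uplink
    capacity; 1.0 means non-blocking. This is a hypothetical simplification,
    not a model of any real Blue Gene/L network.
    """
    if oversubscription < 1.0:
        raise ValueError("oversubscription ratio must be >= 1.0")
    return demand_gb_s / oversubscription


# Target taken from the article; the ratio is an illustrative assumption.
target_gb_s = 30.0
assumed_ratio = 2.0

achieved_gb_s = deliverable_bandwidth(target_gb_s, assumed_ratio)
deficit = 1.0 - achieved_gb_s / target_gb_s

print(f"ceiling: {achieved_gb_s} GB/sec, deficit: {deficit:.0%}")
```

The point of the sketch is only that no amount of tuning on the clients or servers can lift throughput above the ceiling set by the narrowest stage, which is why the fix took months rather than days.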
Hardware and Software Sessions
The second day of the conference was a sandwich of hardware and software sessions. The morning keynote by Norman Miller (UC Berkeley) discussed the use of cluster-enabled climate modeling software to predict the impact of global warming on the snowpack in California's Sierra Nevada. It's not a pretty picture. This work has thrust him into the state government political system. The message here is the success of the open-source WRF (Weather Research & Forecasting) project. Norman and his colleague Jin have added unique capabilities to the WRF code in order to do these simulations and will deliver these improvements to the WRF project for use by other climate researchers.