AMD’s public relations blitz for its new quad-core processors is winding down now. The company’s roll-out of its latest Opterons was welcome news for AMD’s OEM partners. System vendors like Sun, Appro and others were eager to announce their new Opteron-equipped boxes. The lack of an AMD counterpart to the newest Xeon quads had created something of a vacuum in the x86 server market, especially at the high end. (For a more in-depth look at AMD’s new quad-core offerings, take a look at this week’s feature article.) While the impact of the quad-core Opterons in the overall server market will take some time to develop, their effect in the HPC universe will be almost instantaneous.
The new Sun Constellation “Ranger” supercomputer at the Texas Advanced Computing Center (TACC) will be outfitted with quad-core Opteron-based blades this month. The Ranger machine is the result of a $59 million NSF grant awarded to TACC last fall. When deployed, the system will contain nearly 16 thousand of the latest Opteron processors — 15,744 to be exact. The folks at TACC are eager to get the system built so they can start running the kinds of scientific workloads reserved for the most elite systems.
With four cores per processor and each core rated at 8 gigaflops (for a 2.0 GHz CPU), Ranger is expected to achieve a peak performance just north of 500 teraflops. If the Linpack benchmark is able to utilize 75 percent of that capacity, which is not an unreasonable assumption, Ranger will hit roughly 378 (Linpack) teraflops. That figure would best the current number one supercomputer on the Top500 list, the IBM Blue Gene/L system at Lawrence Livermore. With a peak performance of 367 teraflops and a Linpack rating of 280.6 teraflops, Blue Gene has been the top system on the list for the past two years. Now that the new Opterons are on their way to Texas, Blue Gene’s dominance may be coming to an end.
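The peak and Linpack estimates follow from straightforward multiplication, so a quick sketch of the arithmetic may help. The processor count, cores per processor, per-core rating, and Blue Gene/L Linpack score are taken from the figures quoted in this article; the 75 percent efficiency is the assumption stated above, not a measured result, and the function names are purely illustrative.

```python
# Back-of-the-envelope estimate of Ranger's peak and Linpack performance.
# All inputs come from figures quoted in the article; the efficiency value
# is an assumption, not a benchmark result.

PROCESSORS = 15_744            # quad-core Opterons planned for Ranger
CORES_PER_PROCESSOR = 4
GFLOPS_PER_CORE = 8.0          # quoted rating for a 2.0 GHz part
LINPACK_EFFICIENCY = 0.75      # assumed fraction of peak that Linpack achieves
BLUE_GENE_L_LINPACK_TF = 280.6 # current Top500 leader, as quoted


def peak_teraflops(processors, cores_per_processor, gflops_per_core):
    """Theoretical peak in teraflops: processors x cores x per-core gigaflops."""
    return processors * cores_per_processor * gflops_per_core / 1000.0


def linpack_teraflops(peak_tf, efficiency):
    """Estimated Linpack score for a given fraction of peak."""
    return peak_tf * efficiency


peak_tf = peak_teraflops(PROCESSORS, CORES_PER_PROCESSOR, GFLOPS_PER_CORE)
linpack_tf = linpack_teraflops(peak_tf, LINPACK_EFFICIENCY)

print(f"Peak: {peak_tf:.1f} teraflops")
print(f"Linpack at {LINPACK_EFFICIENCY:.0%} of peak: {linpack_tf:.1f} teraflops")
print(f"Ahead of Blue Gene/L: {linpack_tf > BLUE_GENE_L_LINPACK_TF}")
```

Changing `LINPACK_EFFICIENCY` shows how sensitive the comparison is to that one assumption: Ranger would still edge past Blue Gene/L's 280.6 teraflops at anything above roughly 56 percent of peak.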
But TACC and Sun had better hurry. The submission deadline for the November Top500 list is October 15. According to Tommy Minyard, TACC’s assistant director for advanced computing systems, they’re certainly going to try to beat the deadline. He says the first Sun blades should start arriving next week and all the hardware should be installed by the first week of October. Plenty of time.
Ranger is the first commercial deployment of the recently announced Sun Constellation system, an architecture based on the new high-density Sun Blade 6000 technology. While those blades may host Opteron, Xeon, or UltraSPARC processors, the Opterons are the best fit for high-density HPC systems. Just 3,936 four-processor nodes will be required to achieve half a petaflop of performance.
Next to the blades themselves, Sun’s new 3456-port InfiniBand switch is the most critical piece of the system. Only two of these mega-switches will be required for the entire Ranger cluster of roughly 4,000 nodes. The InfiniBand switches will also provide a level of performance that will make the cluster act more like a true supercomputer. Minyard says that MPI latencies will be as low as 1.5 microseconds across two blades in the same chassis and only 2.3 microseconds across the entire fabric. That’s nearly twice as fast as what a typical InfiniBand setup can achieve.
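To see why two switches suffice, here is a rough capacity check. It assumes each node occupies exactly one switch port, which the article does not state, and it ignores any ports that might be reserved for storage or other infrastructure. The node count and port count are the quoted figures; the function name is hypothetical.

```python
# Rough port-capacity check for the Ranger InfiniBand fabric.
# ASSUMPTION: one switch port per node. The real cabling scheme is not
# described in the article, so treat this as a sanity check only.

NODES = 3_936             # four-processor nodes, as quoted
PORTS_PER_SWITCH = 3_456  # Sun's InfiniBand switch, as quoted
PORTS_PER_NODE = 1        # assumed, not from the article


def switches_needed(nodes, ports_per_switch, ports_per_node=1):
    """Smallest number of switches with enough ports (ceiling division)."""
    ports_required = nodes * ports_per_node
    return -(-ports_required // ports_per_switch)


count = switches_needed(NODES, PORTS_PER_SWITCH, PORTS_PER_NODE)
spare_ports = count * PORTS_PER_SWITCH - NODES * PORTS_PER_NODE

print(f"Switches needed: {count}")
print(f"Ports left over: {spare_ports}")
```

Under this assumption a single switch falls 480 ports short of the node count, which is consistent with the two-switch configuration described above.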
TACC is already lining up applications to run on the new system. They’ve been compiling and tuning molecular dynamics codes using pre-production quad-core samples from AMD. Kazushige Goto, TACC’s legendary code wizard, is tuning the new BLAS libraries for AMD’s latest chips. According to Minyard, Goto has been able to extract more performance out of the hardware than even they expected. These are exciting times for the folks at TACC.
Only slightly less fortunate are Oak Ridge National Laboratory (ORNL) and its “Jaguar” XT4 supercomputer. Cray is still waiting on the quad-core “Budapest” chips from AMD so it can upgrade Jaguar to 250 teraflops (peak performance). Budapest is the single-socket version of the new quad-core Opterons, whose delivery was pushed back when the multi-socket “Barcelona” quad-core schedule slipped. The single-socket quads are scheduled to be released in Q4 2007 or Q1 2008, depending on who you talk to. These processors will be used mainly for single-socket workstations, but Cray needs bushels of them to outfit new XT4 systems that have been purchased by a few select government agencies and national labs, like ORNL.
The late delivery of the Budapest chips resulted in Cray lowering its 2007 revenue projections, which means the company will almost certainly not post a profit this year. Cray has apparently been promised an unspecified number of Budapest parts for 2007 so that it can begin shipments of quad-core equipped XT4s before the end of this year. Presumably this means Jaguar will get its quads in time for Christmas. But formal customer acceptance of the system and the associated revenue won’t occur until 2008. By late 2008, the one-petaflop Cray “Baker” system will be installed at ORNL. Baker will also use the new quad-core processors.
This is not to say the Opteron architecture will have a lock on high-end supercomputing. The recently announced Blue Gene/P, based on the PowerPC processor, should provide some stiff competition. Argonne National Laboratory purchased a 114-teraflop system, which will eventually scale to half a petaflop. Other Blue Gene/P systems were purchased by the Max Planck Society and Forschungszentrum Jülich. Beyond that, IBM has designs on multi-petaflop systems based on the POWER7 processor. And all the HPC system vendors are looking at building machines from multiple architectures, using more exotic processors like the Cell, FPGAs, GPUs, and ClearSpeed devices to achieve even greater levels of performance. For now though, the Opteron is enjoying its day in the Sun.
As always, comments about HPCwire are welcomed and encouraged. Write to me, Michael Feldman, at [email protected].