July 14, 2006
With all the talk of heterogeneous supercomputing over the last few years, one might get the impression that a revolution is on the horizon. Certainly, some in the industry have portrayed it as such. Non-scalar vector processors, coprocessor accelerators, MTA processors and FPGAs are available today and can offer tantalizing performance for targeted HPC workloads. The general idea behind heterogeneous processing is that a system contains different kinds of compute engines, and each piece of code is matched to the engine that runs it most efficiently, increasing overall application performance.
From an evolutionary standpoint, heterogeneous processing makes sense. As systems become more complex, a greater amount of architectural specialization is required. This appears to be true for both man-made systems and biological systems. Compare the sophisticated structure of the human brain with the simple bundles of neurons that control many primitive invertebrates.
In the scheme of things, today's computers are still rather primitive themselves, but they already contain many heterogeneous elements. At the level of the chipset, specialized I/O and memory controller devices are commonly used to manage an increasing array of data sources and destinations. Computer memory has differentiated into distinct types, the most common ones being RAM, ROM and cache (3 levels). The CPU remains one of the last general-purpose components of the system. But as applications -- especially HPC applications -- become more complex and more demanding of computational performance, the pressure to tap other types of processing engines will increase.
FPGAs (Field Programmable Gate Arrays), in particular, have been getting a lot of attention lately. They have gained a loyal following in the supercomputing community because they are reconfigurable, have wide applicability for HPC applications, and are commodity-based. And unlike coprocessors, vector processors and MTA processors, FPGAs are more general-purpose compute engines.
The growing interest of the HPC community in IBM's Cell chip is another example. Although the chip contains both a scalar (PowerPC) CPU and vector compute engines, it is not considered a true heterogeneous processor itself. The scalar CPU is used to control the vector cores and manage the chip's memory hierarchy, rather than for computation. But theoretically the Cell could be used as an additional compute resource within a conventional scalar-based system.
All of these non-scalar processors have one thing in common: compared to commodity CPUs, there is not much code support for them. So the software will have to catch up. And that's not going to happen overnight.
Today, the HPC software community is focusing much of its energy on applying code parallelism to scalar processors. Homogeneous multi-core architectures are currently in the driver's seat in high performance computing, as they soon will be in almost all IT markets. This trend is likely to continue for some time. High-volume 64-bit processors that are supported by mature software ecosystems, such as the AMD Opteron and the Intel Xeon (and to a lesser extent, POWER/PowerPC and Itanium), are delivering economical supercomputing performance for the masses. The fact that other microprocessor architectures may be faster, cheaper or more energy-efficient than industry-standard hardware doesn't have much impact on the market until someone figures out a way to mainstream the newer technology.
That usually means developing the appropriate software support for these exotic processors. And if the goal is to integrate that hardware into a truly heterogeneous system, a la Cray's "Adaptive Computing" vision, it will also involve the much more challenging problem of managing heterogeneity in system software. Our own High-End Crusader addressed this issue just a few weeks ago in the article titled "Heterogeneous Processing Needs Software Revolutions."
To its credit, Cray is the only company that has offered a vision of integrated heterogeneous computing, both in hardware and software. But currently it's just a vision, not a product. Even the "Baker" petaflops system the company plans to deliver to ORNL in 2008 is a homogeneous Opteron-based machine. Cray will implement its heterogeneous Cascade architecture when and if DARPA selects it for Phase 3 of the HPCS program. But the company says it intends to move forward with its Adaptive Computing roadmap whether it continues with HPCS or not. Cray believes that the next generation of high performance applications will require a variety of specialized compute engines to obtain reasonable performance (and use reasonable amounts of energy). Cray appears to be committed to that vision.
Other HPC vendors are venturing into the heterogeneous space as well. SGI's Reconfigurable Application Specific Computing (RASC) technology represents its advanced FPGA solution. Sun Microsystems' recently deployed TSUBAME supercomputer incorporates ClearSpeed coprocessors, although they are not in use yet. Other OEMs may come out with their own solutions in the next few years as software libraries and programmer development environments that support these new processor types become available.
But a heterogeneous architecture "revolution" seems unlikely while homogeneous multi-core architectures are so dominant in the commercial space. Revolutions usually start because the masses are unhappy, and that is not the case today. An "evolution" is far more likely and it is currently in progress. The mainstreaming of heterogeneous systems will happen sooner or later because parallelism itself has its limits. Bandwidth, memory access, and software scalability are already inhibiting performance on even moderately scaled systems (thousands of processors). Once we start building petaflops machines, these limitations will become even more aggravating. Heterogeneous computing offers a way forward. Join the evolution!
-----
As always, comments about HPCwire are welcomed and encouraged. Write to me, Michael Feldman, at editor@hpcwire.com.
Posted by Michael Feldman - July 14 @ 12:00AM
Michael Feldman is the editor of HPCwire.