Cray is listening to its customers about their pain points, says Cray President and CEO Peter Ungaro. A presentation by Ungaro is usually an open and relaxed talk interspersed with humor, interesting insights, and a long-term view. He did not disappoint attendees of the 8th LCI International Conference on High Performance Clustered Computing. The focus of his May 17 keynote was “From Beowulf to Cray-o-wulf: Extending the Linux Clustering Paradigm to Supercomputing Scale.”
Ungaro focused on the differences between Beowulf clusters, based entirely on commodity components, and what he termed “Cray-o-wulf” systems, which combine many commodity components with a few custom ones to deliver much higher performance, reliability and manageability at scale. He presented the basic market realities of supercomputing: commodity processors will become primarily focused on scalability, and the proliferation of multicore processors with stagnant single-core performance will continue over the next few years. These trends have generated renewed interest in novel processing architectures and accelerator technologies, an area in which Cray has an established reputation and expertise.
In developing its latest systems, Cray began by recognizing its customers' pain points, namely that big clusters were hitting limitations in the areas of power, cooling and floor space; interconnect performance was a major bottleneck; and user and programmer productivity were suffering as a result of system complexity. A major concern for customers is reliability-availability-serviceability (RAS), which is becoming especially difficult at large scale. Many commodity clusters are experiencing daily failures, an observation that was confirmed by other presentations at the LCI conference.
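The link between scale and daily failures can be illustrated with a rough calculation. The sketch below is not from Ungaro's talk: it assumes nodes fail independently at a constant rate, and the 25,000-hour node MTBF is a hypothetical figure chosen only to show the arithmetic.

```python
def system_mtbf_hours(node_mtbf_hours: float, node_count: int) -> float:
    """Mean time between failures for the whole cluster.

    Assumes independent nodes with constant failure rates, so the
    failure rates add and the system MTBF is the node MTBF divided
    by the number of nodes.
    """
    if node_mtbf_hours <= 0 or node_count <= 0:
        raise ValueError("node_mtbf_hours and node_count must be positive")
    return node_mtbf_hours / node_count


def expected_failures_per_day(node_mtbf_hours: float, node_count: int) -> float:
    """Expected number of node failures somewhere in the cluster per day."""
    return 24.0 / system_mtbf_hours(node_mtbf_hours, node_count)


if __name__ == "__main__":
    HYPOTHETICAL_NODE_MTBF = 25_000.0  # hours; illustrative assumption only
    for nodes in (100, 1_000, 10_000):
        mtbf = system_mtbf_hours(HYPOTHETICAL_NODE_MTBF, nodes)
        per_day = expected_failures_per_day(HYPOTHETICAL_NODE_MTBF, nodes)
        print(f"{nodes:>6} nodes: system MTBF {mtbf:8.1f} h, "
              f"about {per_day:.2f} failures per day")
```

Under these assumptions, a node that fails on average once in almost three years still yields roughly one failure per day somewhere in a 1,000-node cluster, which is why reliability becomes a design concern in its own right at this scale.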
Ungaro stated that today there are few good storage and data management options and no reasonable options for accelerator support, both of which matter to customers as their systems scale ever larger. Cray is bringing its experience in designing and developing high-productivity computer systems to the commodity-driven cluster market, along with a number of new directions and insights.
“Cray's design is based on the premise that pure commodity clusters begin to break down in a number of ways after about 1,000 processors,” said Ungaro. “Beowulf commodity clusters did a great job of getting us a low-cost solution to scale past what we could do with SMP technologies, getting us from hundreds of processors upwards of 1,000. But above the 1,000 processor limit, the pure commodity approach breaks down and does not provide productive, operational or maintainable systems.”
The Cray-o-wulf approach is based on the convergence of a number of technologies and capabilities that Cray has expertise in, or is in the process of developing. The foundation for this is the company's Adaptive Supercomputing framework, a tightly-coupled integration of hardware and software based on high-availability building blocks. The framework targets capability supercomputing, a market in which customers have more complex applications and higher performance requirements than most mainstream HPC users. Cray's Adaptive Supercomputing vision leverages the company's strength in supercomputing, but aims to broaden the market for its technologies by combining multiple processing architectures into a single, scalable system. Cray's premise is that making supercomputing easier to use will draw in a new set of users that require higher sustained performance from their applications.
A critical component of Cray's system architecture is its proprietary high-bandwidth, low-latency interconnect. The network is highly resilient in the face of transient errors, whereas other network technologies just drop the packets and pay a retransmission performance penalty. Ungaro noted that proprietary interconnects account for only 42 percent of system fabrics on the current Top500 list. But if one narrows it down to the Top 50, then proprietary interconnects account for approximately 77 percent of the systems. In a February 2006 ComputerWorld article, NCAR's James Hack observed: “As scientific computing migrated toward commodity platforms, interconnect technology, both in terms of bandwidth and latency, became the limiting factor on application performance and continues to be a performance bottleneck.”
In Cray's case, while many of its machines are built from available commodity multicore processors and standard memory chips, the specialized interconnect technology is what fundamentally differentiates a Cray-o-wulf from a Beowulf cluster.
Cray envisions multiple commodity and specialized processor technologies: scalar x86/64 processors (such as the AMD Opteron that Cray uses today in its popular XT4 systems), vector processors (for which Cray has long been known), multithreaded processors (to address newer application areas that are less floating-point intensive but rely on novel algorithms, such as those based on graphs) and exotic hardware accelerators (such as GPUs or FPGAs, the latter of which were used in the Cray XD1).
Ungaro also hinted at the possibility of combining these various processing technologies in a future adaptive processor. Such a processor would create an integrated hybrid supercomputer architecture that lets a diverse user community choose the processor technologies that best meet its computing needs, with the goal of higher sustained performance on real applications, not just on benchmarks or peak performance metrics.
Cray's Adaptive Supercomputing model will also address the software requirements of an adaptive supercomputer with the requisite ultra-lightweight Linux operating system along with libraries, tools, compilers, and a scalable runtime environment. All of this leads to a transparent interface to the application developer and user. Ungaro mentioned that Cray decided to move its future systems from a proprietary UNIX operating system to Linux to provide its customers access to the many applications and tools that are available.
When Ungaro was asked where Cray views itself being competitive in the marketplace, he responded, “We see our technology providing the most value to customers at systems sizes of about 1,000 sockets and up, at the high-end of the HPC market. Recently we've been very successful with systems in the 2,500 and up socket range, as scaling operational systems to that level is nearly impossible for commodity clusters. We are actively working on technologies that continue to bring the cost of our systems down while not sacrificing any of the performance, reliability, and total cost of ownership advantages of our traditional systems. This is beginning to attract a much broader base of customers to Cray systems, not just in the research and earth sciences communities, but increasingly in new segments of the scientific research market and in industrial segments such as life sciences, automotive and aerospace.”
About the Author
Allan Torres began his career at Cray Research in 1979, when he was hired as an I/O Design and Prototype Engineer in Chippewa Falls, WI. He is the founder of The Torres Group, LLC, an independent consultancy in the high performance computing, networking and storage markets. To learn more about Allan, visit http://oftheuniverse.com/index.html.