This week, Linux-on-Itanium fans convened at the Gelato Itanium Conference and Expo (ICE) in Singapore to talk about platform issues and spotlight success stories. Cameron McNairy, Itanium Processor Architect and Principal Engineer, gave the opening keynote and presented a couple of other technical sessions on the microprocessor architecture. HPCwire asked McNairy about Itanium's role in high performance computing, the current maturity of Itanium-based systems, and what we can expect to see in the future.
HPCwire: What do you think the Itanium processor brings to the table that can't be found in other RISC (Power, Sparc) and CISC (x86) architectures?
McNairy: The Itanium processor brings three things to the table: choice, flexibility and performance. Itanium systems are supported by six different OSes, over 10,000 applications and eight major OEMs providing specialized systems for different market segments. Other RISC-based systems are proprietary and don't come close to offering the breadth of solutions that Itanium can. The end result is that Itanium OEMs can invest in hardware systems that deliver more options across a wider range of potential customers. For example, HP has offered three different RISC-based systems (PA-RISC, Alpha, and NonStop), each with its own associated OS. While you could not get a NonStop kernel on PA-RISC or OpenVMS on MIPS, HP's Itanium-based Integrity servers are available to serve NonStop, OpenVMS, Linux, Windows, and HP-UX customers, which is a huge advantage for HP. The dual-core Itanium 2 processor holds four world performance records, including a SPECint_rate_base2000 score of 4230, nearly triple the previous record.
HPCwire: What attributes of the Itanium make it particularly well-suited to HPC workloads?
McNairy: The direct support of large-memory SMP systems makes Itanium an ideal architecture for the most demanding HPC workloads. With 50-bit physical addressability, Itanium is poised to address and manage petascale memory subsystems. As we enter the petascale era, much must be considered regarding data sets and programmability of HPC systems. Time to solution is a major consideration. Where the application space hasn't had time to accommodate explicit message passing algorithms, Itanium-based SMP systems offer a fast time-to-solution environment. Combined with their multi-level error containment architecture, Itanium-based SMP systems can deliver petascale performance quickly and can have a sufficient MTBI to allow the calculations to actually complete. The HPC arena was one of the key design targets of both the Itanium architecture and the Itanium 2 implementations.
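As a quick check on the addressing figure mentioned above, the following minimal sketch computes how much memory 50 physical address bits can reach. It assumes byte-addressable memory and binary (power-of-two) units; the function name is illustrative and does not come from any Intel documentation.

```python
def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable with the given number of physical address bits.

    Assumes each address names exactly one byte.
    """
    return 2 ** address_bits


PIB = 2 ** 50  # one pebibyte, in bytes

total = addressable_bytes(50)
print(total)        # 1125899906842624
print(total / PIB)  # 1.0
```

Under these assumptions, 50 address bits cover exactly one pebibyte, which is the sense in which the address space is "petascale".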
HPCwire: How do you think the system OEMs are doing in exploiting the potential of Itanium?
McNairy: Great, but there is more work to be done. We work closely with our customers and are often amazed at the interesting things they are doing with our processors. For example, the SGI systems with large SMP configurations and the NEC, HP and Unisys systems take full advantage of the reliability, scalability and availability features of the architecture. At the same time, we see some of them approaching the processor as a black box, leading them to miss out on some of the key capabilities and configurations. Accordingly, we work closely with our customers to help them deliver the best Itanium platforms possible. For example, Intel provides tools and training, along with access to our architects to resolve questions and concerns, and to enable designs. Software is a key element of any system, and poor software can make even the best system perform poorly. We see software as a critical link in the Itanium platform chain. To achieve critical software momentum and address key software issues, the Itanium Solutions Alliance was formed last year to broaden software development. The Alliance pools the efforts of multiple OEMs, OSVs, and ISVs for added momentum and critical mass.
HPCwire: How would you characterize the current maturity of software support for Itanium (compilers and other development tools, libraries, OS support, applications, etc.)? What are the biggest challenges here?
McNairy: I am really pleased with the maturity of software for Itanium, but again, there is more work to be done. There are over 10,000 key applications that support Itanium today. The compilers are maturing and continue to deliver performance and robustness for Itanium systems. Could software support be even better? Yes. Will it get better? Absolutely. This is a marathon, not a sprint. Software momentum is ramping quickly with multiple innovations coming soon. One of the biggest challenges is scaling operating systems to 32-, 64- and up to 128-processor systems. We want broader focus from the software industry because of the opportunity that exists. We continue to work with this industry to extend their focus.
HPCwire: Can you talk a little bit about future Itanium features and their significance?
McNairy: In the future, Intel plans to provide new features for the high-end market segment, such as new reliability and availability features, multi-core processors and even higher-speed interconnects. We will continue to work very closely with hardware and software vendors to better understand their needs while investigating ways to incorporate that input and knowledge into future designs.
HPCwire: A bit of a philosophical question here. There's been increasing talk in the HPC community about heterogeneous systems in which future platforms will be populated by a variety of specialized processors (GPUs, Cell processors, FPGAs, vector processors, FP co-processors, etc.), where each processor has the ability to deal with certain types of code more efficiently than a general-purpose processor. Do you think this is a natural evolution of computer architectures? And if so, where does this leave general-purpose scalar/FP processors?
McNairy: The answer depends on the ability to simplify the transition such that the return on investment is huge. Hardware is certainly easier to change than software, and software would need to change to support special-purpose (non-general-purpose) compute devices effectively. Thus, I see the current niche continuing, with specialty processors abstracted by a dedicated library that the application writer does not know about, until we change the way we think about computational problems and the way we put algorithms to software. I am personally very excited about special-purpose computing elements (my master's thesis was on FPGA FP acceleration using a systolic design). I think the implementation is simple; the challenge lies in successfully abstracting the hardware such that the cost is low enough to produce a return on investment. Will user-defined instructions/capabilities make it onto the processor? Certainly, but only in limited use cases until we address the software problem.
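The "dedicated library" pattern described above can be sketched roughly as follows. This is a hypothetical illustration, not any vendor's API: the names `register_accelerator` and `dot` are invented here, and a real library would probe for an FPGA or GPU driver instead of relying on manual registration. The point is only that the application calls `dot` and never learns which backend ran.

```python
from typing import Callable, Optional, Sequence

Vector = Sequence[float]
DotFn = Callable[[Vector, Vector], float]

# Hypothetical hook: set when a special-purpose device backend is available.
_accelerated_dot: Optional[DotFn] = None


def register_accelerator(fn: DotFn) -> None:
    """Install a device-specific dot-product routine (assumed interface)."""
    global _accelerated_dot
    _accelerated_dot = fn


def _cpu_dot(a: Vector, b: Vector) -> float:
    """Portable fallback that runs on the general-purpose processor."""
    return sum(x * y for x, y in zip(a, b))


def dot(a: Vector, b: Vector) -> float:
    """Public entry point: the caller never sees which backend is used."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    backend = _accelerated_dot if _accelerated_dot is not None else _cpu_dot
    return backend(a, b)


print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
```

Application code stays the same whether or not an accelerator is registered, which is the low-cost abstraction the answer argues is the hard part.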
-----
Cameron McNairy is a Principal Engineer and an Intel Architect for the Montecito program. Before Montecito, Cameron was a micro-architect for the Itanium 2 processor, contributing to its design and final validation. He plans to focus on performance, RAS (reliability, availability, serviceability), and system interface issues in the design of future Itanium Processor Family (IPF) products. He came to the Itanium 2 team soon after its inception from performance work on the first Itanium processor. Cameron received a BSEE and an MSEE from Brigham Young University. He is a member of the Institute of Electrical and Electronics Engineers.