The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
June 16, 2006
In the second part of our interview with Michael Levine and Ralph Roskies, the two scientific co-directors at the Pittsburgh Supercomputing Center (PSC), they talk about the LeMieux computer, the PSC approach to supercomputing, and the challenges that lie ahead for the center. To read part one of the interview, where they talked extensively about the center's Cray XT3 system, go to http://www.hpcwire.com/hpc/686730.html.
HPCwire: LeMieux, your 3,000-processor six-teraflop HP system, which came into service in 2001, was the first NSF terascale system and for several years was the most powerful system available to NSF researchers. At nearly five years old, it's still one of the most used TeraGrid resources. What are your plans for this system and how much longer can it be useful?
Levine: It can be useful for a very long time. It's a question of how long it will continue to be cost effective. If Moore's Law holds, the amount of computing you can get from initial dollar capitalization keeps improving. On a monthly cost basis, this is a matter of maintenance costs. Likewise with the amount of computing per watt. Power is a large cost factor.
Roskies: At some point, it will no longer be cost effective and we will by then have transitioned the users to the XT3. No one will be left hanging.
Levine: Technically, LeMieux turned out to be a very good machine and continues to be a very good machine, very useful; it's not at a breaking point in any serious sense.
HPCwire: PSC has gained a reputation for its ability to take the leap with new technologies and transform them quickly into productive research tools. Going back to the CRAY Y-MP through half-a-dozen systems up to the XT3, you've received early, if not the first, models of new systems. What are the advantages of this approach? Are there disadvantages?
Roskies: The advantage is the payoff to the scientific community, because new machines will soon enough be sunsetted, as determined by the pace of technological development. So if you can get machines early in their cycle, it means you can use them longer. The earlier you get it, the more science you can get done in the useful lifetime of that machine.
Levine: Also, you bring that capability to the scientific community earlier. You could, of course, wait to introduce any new system into the open research community until it's more mature. But we can get productive use out of this early period, which means it's producing science that much sooner. And it allows us to have more influence with the vendors for the course of development of the system and its application to the NSF research community. This has certainly been the case for our involvement with the XT3 at the Sandia stage.
Roskies: The disadvantage is that there's more work by our systems staff than if we simply waited until the bugs get worked out. The machine would be better understood and it would be less effort to make it available. Of course, we're a major force in making it better understood, so not only are we improving things for our own users, we're improving it for everybody else's XT3 users. Somebody would have to discover these bugs. You can't avoid them.
A benefit to PSC is the cumulative aggregation of knowledge and experience that our staff gain in the process of birthing new systems, over and over, with various vendors and architectures.