CLUSTERED HPC: AN INTERVIEW WITH THOMAS STERLING

November 8, 2000

by Alan Beck, editor in chief LIVEwire

Dallas, Texas — Thomas Sterling holds a joint appointment with NASA’s Jet Propulsion Laboratory (JPL) and the California Institute of Technology (Caltech), serving as Principal Scientist in JPL’s High Performance Computing group and Faculty Associate in Caltech’s Center for Advanced Computing Research. He received his Ph.D. as a Hertz Fellow from MIT in 1984.

For the last 20 years, Sterling has engaged in applied research in parallel processing hardware and software systems for high-performance computing. He was a developer of the Concert shared-memory multiprocessor, the YARC static dataflow computer, and the Associative Template Dataflow computer concept, and has conducted extensive studies of distributed shared-memory cache-coherent systems. In 1994, Sterling led the NASA Goddard Space Flight Center team that developed the first Beowulf-class PC clusters. Since 1994, he has been a leader in the national Petaflops initiative. He is the Principal Investigator for the interdisciplinary Hybrid Technology Multithreaded (HTMT) architecture research project sponsored by NASA, NSA, NSF, and DARPA, which involves a collaboration of more than a dozen cooperating research institutions. Dr. Sterling holds six patents, and was one of the winners of the 1997 Gordon Bell Prize for Price/Performance.

Sterling gave a state-of-the-field talk on COTS Cluster Systems for High-Performance Computing at SC2000; HPCwire talked with him to obtain a better perspective on his views:

HPCwire: Your work in clustered supercomputing has literally revolutionized HPC in the last few years. But surely there is a limit to what is possible for this type of technology — or is there? What are the most serious factors currently circumscribing the capabilities of clustered HPC? Are any solutions on the horizon?

STERLING: The rate of growth in numbers, scale, and diversity of the implementation and application of clusters in HPC, including (but not limited to) Beowulf-class systems, has been extraordinary. But my work with Don Becker on the early Beowulf systems succeeded in no small part because of much previous and continuing good work accomplished by many others in the distributed computing community in hardware and software systems. Workstation clusters (e.g. COW, NOW), message-passing libraries (e.g. PVM, MPI), operating systems (e.g. BSD, Linux), middleware (e.g. Condor, Maui Scheduler, PBS, the Scyld scalable cluster distribution), and advanced networking (e.g. Myrinet, QSW, cLAN) are only a few examples of the ideas, experiences, and components that contributed to the synthesis of Beowulf-class PC clusters and continue to push cluster computing forward at an accelerating rate. And driving all of that enabling technology are the computational scientists, adapting their distributed application algorithms to the not-always-friendly operational properties of successive generations of Beowulf platforms.
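
As a minimal illustration of the message-passing style that libraries such as PVM and MPI brought to these clusters, the sketch below shows worker nodes returning partial results to a root node over MPI. It is illustrative only: the payload and the stand-in local computation are hypothetical, and an MPI implementation is assumed to be installed on every node of the cluster.

    /* Minimal MPI sketch: each worker node sends a partial result to rank 0.
     * Illustrative only; the "local computation" is a stand-in. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, i;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {
            double sum = 0.0, val;
            for (i = 1; i < size; i++) {
                MPI_Recv(&val, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                sum += val;
            }
            printf("collected %d partial results, sum = %f\n", size - 1, sum);
        } else {
            double val = (double)rank;   /* stand-in for local computation */
            MPI_Send(&val, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

On a typical cluster of this era such a program would be compiled with mpicc and launched across the nodes with mpirun.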

It has been a pleasure to play a role in the Beowulf phenomenon, but it is the accomplishment of many, not just a few. Many government organizations have contributed to this, including a number of NASA and DOE labs, with valuable tools disseminated to the community as open source software by some of them (e.g. Argonne, Oak Ridge, Goddard, Ames). And this is being paralleled by the more recent work in large NT-based clusters of PCs as well (e.g. at NCSA, CTC, UCSD). Of course, the field of Beowulf computing has now matured such that it is partnered with industry, large and small, in hardware (e.g. Compaq, IBM, VA Linux, HPTI, Microway) and software (e.g. Turbo Linux, Red Hat, SuSE, Scyld) providing improved functionality, performance, and robustness at reasonable (usually) cost. As a result, many tasks in academia, industry, government, and commerce are now performed on this class of systems, providing a stable architecture family for both ISVs and applications programmers to target with confidence while riding the Moore wave through future generations of advanced technology. Indeed, many of our computer science students have their first experiences with parallel computing on small Beowulfs.

How far clusters in general and Beowulf-class systems in particular can go is a tantalizing question. The challenges today may be seen in three dimensions: 1) bandwidth and latency of communications, 2) usability and generality of system environments, and 3) availability and robustness for industrial-grade operation. The first is now being addressed by industry, perhaps starting with the pathfinding work of Chuck Seitz with Myrinet. Improvements in both latency and bandwidth of one and two orders of magnitude over the baseline Fast Ethernet LAN are being achieved with such consortium drivers as VIA and InfiniBand. Bandwidths beyond 10 Gbps and real latencies approaching a microsecond are on the horizon as zero-copy software and optical channels become mainstream for future system area networks.
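
A simple way to see where a given interconnect sits on these latency and bandwidth curves is a ping-pong test between two nodes, sketched below. It is illustrative only: the 1 MB payload and iteration count are arbitrary choices rather than a standard benchmark, and latency is normally estimated the same way with very small messages.

    /* Ping-pong sketch for estimating point-to-point bandwidth between two
     * cluster nodes.  Illustrative only, not a standard benchmark. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define ITERS 100
    #define BYTES (1 << 20)               /* 1 MB payload */

    int main(int argc, char **argv)
    {
        int rank, i;
        char *buf = malloc(BYTES);
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        t0 = MPI_Wtime();
        for (i = 0; i < ITERS; i++) {
            if (rank == 0) {
                MPI_Send(buf, BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        if (rank == 0) {
            double rtt = (t1 - t0) / ITERS;   /* seconds per round trip */
            printf("round trip %.1f us, bandwidth %.1f MB/s\n",
                   rtt * 1e6, 2.0 * BYTES / rtt / 1e6);
        }
        free(buf);
        MPI_Finalize();
        return 0;
    }

Run over Fast Ethernet and again over a network such as Myrinet, the same sketch would make the order-of-magnitude gap described above directly visible.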

A number of groups in the US, Japan, and Europe are developing tools to establish acceptable environments for managing, administering, and applying these systems to real-world workloads. This will take some time to shake out, although significant progress is finally being made. Various efforts to collect representative tools into usable distributions (e.g. Oscar, Scyld, Grendel) and make them available involve collaborations across many institutions. While such systems may never be easy to program or truly transparent or seamless in their supervision, they may prove sufficient within the bounds of practical necessity.

Finally, the issue of reliability is one that appears to vary dramatically. One hears horror stories of nodes dying every few hours and others where complete systems stay up for more than half a year. At Caltech our Beowulf “Naegling” has had a worst-case node failure within 80 days and a best failure-free interval of almost 200 days. This is after surviving the usual burn-in period. Infant mortality is always part of the experience, and certain types of components (e.g. fans, power supplies, disks, NICs) tend to fail within the first few weeks. Then the systems stabilize. A similar process occurs with the software environments; bugs in the installation and configuration are exposed early on and have to be eliminated one by one, sometimes painfully. But industry investment in the mass-market nodes and networks, and their recent efforts in system integration, are showing results in improving availability and robustness. More work is needed in limiting the downtime of a system when an individual component dies. There are severe challenges in even detecting when wrong results are produced while a system keeps running. These are expected to receive increasing attention as a real market, especially in commerce, is found for systems as large as thousands of processors.
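
The arithmetic behind these failure rates is simple but worth making explicit: if node failures are roughly independent, the expected time to the first failure somewhere in the machine shrinks in proportion to the node count. The sketch below works through a purely hypothetical per-node MTBF to show the effect.

    /* Back-of-the-envelope sketch of why large clusters see frequent node
     * failures even when individual nodes are fairly reliable.  Assumes
     * independent, exponentially distributed failures (a simplification);
     * the per-node MTBF figure is purely hypothetical. */
    #include <stdio.h>

    int main(void)
    {
        double node_mtbf_days = 1000.0;        /* assumed per-node MTBF */
        int sizes[] = { 16, 128, 1024, 8192 };
        int i;

        for (i = 0; i < 4; i++)
            printf("%5d nodes: expected time to first node failure ~ %.1f days\n",
                   sizes[i], node_mtbf_days / sizes[i]);
        return 0;
    }

Under that assumed per-node MTBF of 1000 days, a 1024-node system would expect its first node failure within about a day, which is why limiting the downtime caused by each failure matters at least as much as eliminating failures.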

We are approaching the milestone (albeit somewhat arbitrary) of being able to assemble a Teraflops-scale Beowulf-class system for one million dollars. But the cost of running and maintaining such a system is non-trivial and has to be accounted for. And industry (e.g. Sun, Compaq, SGI, IBM) is playing an increasingly important role in making such systems accessible. Another area that is lagging is that of distributed mass storage and generalized parallel file servers. Systems oriented around the storage and fetching of mass data sets are likely to drive the commercial customer base for clusters and play an important role in scientific computing as well. While some early systems are being employed (e.g. PPFS, PVFS), much work has yet to be done in this area. With system-on-a-chip (SOC) technology allowing multiple processors and their integrated caches to be implemented on a single die, and clock rates slowly increasing through the single-digit GHz regime, performance density is likely to continue to advance at a steady pace. Will we see a Petaflops Beowulf by 2010, as possibly implied by the Top500 list? It is not out of the question, although personally I hope we find a better way. Beowulf was always about picking the low-hanging fruit and has consistently shown that where there is a way, there is a will.
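
How such a projection plays out depends heavily on the assumed starting point and doubling period. The toy calculation below takes the one-Teraflops-per-million-dollars figure above as a 2000 baseline and assumes an 18-month doubling period; both numbers are assumptions, and the point is only to show how sensitive a “Petaflops by 2010” prediction is to them.

    /* Toy projection of cluster price/performance under an assumed
     * Moore's-law doubling period.  Both the 2000 baseline (1 Tflops per
     * $1M) and the 18-month doubling period are assumptions. */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double baseline_tflops = 1.0;   /* Tflops per $1M in 2000 (assumed) */
        double doubling_years  = 1.5;   /* assumed doubling period */
        int year;

        for (year = 2000; year <= 2015; year += 5)
            printf("%d: ~%.0f Tflops per $1M\n", year,
                   baseline_tflops * pow(2.0, (year - 2000) / doubling_years));
        return 0;
    }

Under these particular assumptions a million-dollar Beowulf reaches roughly 100 Tflops in 2010 and only about a Petaflops around 2015; a faster doubling rate or a larger budget pulls the date in. (The sketch uses pow, so it would be compiled with -lm.)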

HPCwire: Within the last year several firms have emerged that are solely focused upon exploiting computing power from large networks of Internet-connected PCs. How do you view these efforts? What will ultimately determine the success or failure of such ventures?

STERLING: This is a new frontier in distributed computing and one based on the perceived opportunity of an untried business model. What I call “cottage computing” is unique and has no analogue in other domains of economy or production (that I can think of) since the beginnings of the industrial revolution in the mid-18th century. The SETI@home experience is tantalizing and stimulates consideration of the broader applications that are driving these new enterprises. But I am extremely uncertain of the outcome. It will ultimately be determined by the complex interplay of factors including the difficulties of achieving adequate security in both directions, the relative value of diffuse computing cycles, and the competing alternatives. While I am not yet convinced of a favorable outcome, this is an exciting process with some very sharp people heavily engaged. Its evolution will be very interesting to watch over the next 18 months.

HPCwire: As Principal Investigator for the interdisciplinary Hybrid Technology Multithreaded (HTMT) architecture research project, you have a unique insight into the characteristics of these fascinating technologies. Please share some of your thoughts and observations with us.

STERLING: The multi-institution, interdisciplinary HTMT architecture research project is a four-year effort to explore a synthesis of alternative technologies and architectural structures to enable practical general-purpose computing in the trans-Petaflops regime. The genesis of this advanced exploratory investigation was catalyzed by the initial findings of the National Petaflops Initiative, a community-wide process, and is aligned with the strong recommendations of the President’s Information Technology Advisory Committee (PITAC) on high performance computing research directions. Through HTMT, significant insights have been acquired revealing the potential of aggressively exploiting non-conventional strategies to achieve ultra-scale performance. Perhaps the most important was the value of inter-relating system structure and disparate technologies to accomplish a synergy of complementary technology characteristics. Much of the public attention and controversy has been on the technologies themselves, which pushed the capability of logic speed, storage capacity, and communications throughput to extremes.

While the project was not committed to any particular device, it studied specific example technologies in detail, in some cases contributing to their advancement. Among these, the innovative packet-switched Data Vortex optical network, exploiting both time-division and wave-division multiplexing, may have near-term impact for a wide range of high-end systems. Optical holographic storage was shown to provide one possible means of providing a high-density, high-throughput memory layer between primary and secondary storage. The merger of semiconductor DRAM cells and CMOS logic was shown to enable Processor-in-Memory (PIM) smart memory structures that may make possible new relationships between high-speed processors and a memory hierarchy imbued with extended functionality. The most controversial aspect of the project was its investigation of superconductor rapid single flux quantum (RSFQ) logic. Conventional wisdom holds that earlier experience at IBM and in Japan demonstrated that computers built from superconductor electronics were infeasible and that the cooling requirements made them impractical.

The findings of the HTMT project are that Niobium-based RSFQ logic is both feasible and practical and affords unique opportunities for the design of very high-speed processors with clock rates between 50 GHz and 150 GHz. However, within the constraints of existing fabrication facilities and industrial/government investment, the likelihood of realizing such components is remote. Even more significant than the technologies within HTMT is the architecture that would incorporate them. HTMT explored the potential of a dynamic adaptive resource management methodology called “percolation” that employs smart memories to determine when tasks are to be performed and to pre-stage all information related to task execution proactively using low-cost in-memory logic. The conclusion is that such small processors can remove the combined problems of overhead and latency from the main processors while performing many of the low-locality, data-intensive operations in the memories themselves. The result would be highly efficient operation even on those algorithms that have proven difficult to optimize in the past. The overall conclusion of the HTMT project is that there are strong opportunities for increased investment in high performance computer system research to capture significant potential benefits as yet unexploited.
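
To make the percolation idea a little more concrete, the toy sketch below separates a memory-side role, which copies a task’s operands into a staging area and marks the task ready, from a processor role that executes only fully staged work. It is a heavily simplified conceptual illustration, not HTMT’s actual mechanism; every name and data structure in it is hypothetical.

    /* Toy illustration of the "percolation" idea: memory-side logic stages a
     * task's data before the fast processor sees it, so the processor only
     * runs work whose operands are already local.  Conceptual sketch only;
     * all names here are hypothetical, not HTMT's design. */
    #include <stdio.h>
    #include <string.h>

    #define TASKS 4
    #define N 8

    struct staged_task {
        double data[N];   /* operands copied into fast memory ahead of time */
        int ready;
    };

    static struct staged_task queue[TASKS];

    /* Memory-side (PIM-like) role: gather operands and mark the task ready. */
    static void percolate(int t, const double *far_memory)
    {
        memcpy(queue[t].data, far_memory + t * N, N * sizeof(double));
        queue[t].ready = 1;
    }

    /* Processor role: execute only tasks whose data is already staged. */
    static double execute_ready(void)
    {
        double sum = 0.0;
        int t, i;
        for (t = 0; t < TASKS; t++)
            if (queue[t].ready)
                for (i = 0; i < N; i++)
                    sum += queue[t].data[i];
        return sum;
    }

    int main(void)
    {
        double far_memory[TASKS * N];
        int i, t;

        for (i = 0; i < TASKS * N; i++)
            far_memory[i] = 1.0;

        for (t = 0; t < TASKS; t++)
            percolate(t, far_memory);      /* pre-staging phase */

        printf("result = %.1f\n", execute_ready());
        return 0;
    }

In a PIM-based design the staging would proceed concurrently in the memory devices themselves, so the fast processors never stall waiting for distant data.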

HPCwire: What are the most important issues facing HPC today? What are the best ways those within the community can pursue creative solutions?

STERLING: The dominant strategic issues today are, first, whether HPC is important, and second, whether all future HPC systems must be limited to COTS clusters and their equivalents. While the first issue may appear silly to some, there is a real threat to HPC and supercomputing as a goal and discipline, with some respected colleagues publicly stating that performance as a research goal is no longer important. This is in part driven by the excitement about the potential of the Internet, Web, and Grids, which are perceived as a more attractive and lucrative area of pursuit than HPC systems development. There is also an apparent malaise derived from the perception of a small and shrinking HPC market, the Moore’s law juggernaut, lack of funding, the diminishing glamour, and the poor track record of such research in the past. For this reason, where HPC is really needed, both industry and academia in many cases perceive clusters, including but not limited to Beowulf-class systems, to be an easy, relatively low-cost way out, with the short-term difficulties presumed to be rectified to an adequate degree by future developments in distributed system software.

Our work with Beowulfs has shown us that in many cases this is an acceptable solution and that the contributions being made by Becker and many others will reduce, although probably not close, the gap between cheap hardware and needed user environments. But my work on HTMT and Petaflops-scale computing has revealed both the need for and the opportunity of devising innovative new structures for attacking major computational challenges at performance levels orders of magnitude beyond what is being implemented today. The early work by IBM on its BlueGene project is suggesting the same conclusions. Solving problems of controlled fusion, molecular protein folding and drug design, high-confidence climate modeling, complex aerospace vehicle design optimization, and brain modeling, while perhaps not as enticing as real-time video games and e-commerce, would nonetheless revolutionize human existence in the 21st century.

From a technical perspective, the dual challenges of good price-performance for scalable systems and latency management for acceptable system efficiency are matched by the more vague goals of programmability and generality. These are nothing new in the field of parallel processing, but their impact is of increasing significance as system scale extends beyond 10,000 coarse-grain nodes (e.g. ASCI White) or even a million fine-grain nodes (e.g. IBM BlueGene) and as more complex interdisciplinary applications are pursued. Cost is important, and the need to devise structures that can be realized at low cost, other than by COTS cluster techniques, is critical. PIM is one very real possibility here, but the architectures, while retaining simplicity, must be advanced well beyond current examples.

In my view (and others may disagree), dynamic adaptive resource methodologies, most likely exploiting PIM smart memories, may address the key problems of latency (perhaps through percolation), overhead, and load balancing while simplifying both hardware and software development. But in the long term, even as it pains me to say so, I see a need for advanced parallel languages that are not constrained by assumptions of conventional underlying hardware components and organizations. I believe a new decision-tree model for resource management is required, one that revises the notion of what a computer knows and when it knows it in making the determination of resource-to-task allocation in time and space. These questions are both significant and tantalizing. It remains only for the combined high performance research community to revitalize its commitment to their pursuit and ultimate resolution.

HPCwire: How would you characterize the current interrelationship between national policy, corporate policy, and leading-edge HPC research? Should this be modified? If so, how?

STERLING: It is difficult to characterize “national policy” as it pertains to HPC research. The PITAC recommendations on future directions in HPC research were clear and specific, and I adhere to them both in principle and in their explicit proposed actions. These recommendations were not addressed by the Federal agencies for FY01, although many other important areas in IT considered by PITAC did receive attention. There is real interest in many quarters to do so, but at the moment, aggressive pursuit of these ideals remains dormant. Corporate policy quite reasonably focuses on the sweet spot of the market, and the cluster approach lends itself well to this strategy, providing a degree of scalability without investing in unique systems for the high end. The risks are too high for industry alone to attack them while the perception is that the market is too small to provide adequate financial return. My guess is that the latent market is much greater, but not at the price-performance point of the older supercomputers and MPPs. Of course, many applications today routinely run on the desktop at performance levels that consumed supercomputers a decade ago. That should be a strong signal that the opportunities for much higher performance systems are plentiful. However, the community either has not gotten the hint or rather uses the same experience to justify waiting: Petaflops will come to those who wait; Moore or less.

From my previous comments, yes, I believe the apparent policies and interrelationships should be modified. The partnership between national policy and corporate policy in HPC research should be one of mutual and complementary strengths. The DOE ASCI program performed well in working with industry to develop pace-setting high performance systems through the extension of conventional means, and in so doing demonstrated the value of advanced capability systems for exploring the frontiers of science and technology through computation. But no counterbalancing, non-incremental advanced research of significance has been undertaken or sponsored to explore over-the-horizon regimes. Yes, quantum computing and other exotic forms of processing are being supported under basic research. But there is a major gap between these and today’s conventional distributed systems. I would like to see the PITAC recommendations in HPC carried out, and a partnership developed between industry and government, involving the academic community, to explore innovative opportunities and reduce the risk, so that a truly new class of parallel computer system can emerge to escape the current cul-de-sac in which we are trapped and deliver a revolutionary new tool with which to build the world habitat of the 21st century.
