November 23, 2007
The SC07 Cluster Challenge was held in conjunction with the SC07 conference in Reno, Nevada. The event sought to create an exhibition and competition in which teams of undergraduate students would compete in a demonstration of talent, technology and accessibility of entry-level supercomputing. The activity was intended to highlight the gains in hardware performance, ease of use of clusters and the power and availability of simulation software.
To whet people's appetites, the Cluster Challenge Committee predicted that half a rack of a modern cluster would be competitive with the number one system on the TOP500 from only 10 years ago. In fact, the top Linpack score realized in the Challenge was 420 gigaflops, which would have made the TOP500 list only three years ago. This announcement was made during the TOP500 BoF session at SC07 and was met with loud cheers and applause from the attendees.
The rules of the Challenge were simple: teams could draw a maximum of 26 amps of power (at 110 volts), and no team member could have completed an undergraduate degree. Six teams and their vendor partners chose to compete.
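The two headline numbers above imply a concrete efficiency figure. A quick back-of-the-envelope calculation (illustrative only; actual systems would not run at exactly the budget ceiling):

```python
# Back-of-the-envelope arithmetic for the Cluster Challenge limits:
# a 26 A draw at 110 V, and the top Linpack result of 420 gigaflops.

AMPS = 26
VOLTS = 110
LINPACK_GFLOPS = 420

power_watts = AMPS * VOLTS                      # total power budget in watts
gflops_per_watt = LINPACK_GFLOPS / power_watts  # efficiency at the full budget

print(f"Power budget: {power_watts} W")
print(f"Efficiency: {gflops_per_watt:.3f} GFLOPS/W")
```

The budget works out to 2,860 W, so the winning-class Linpack run corresponds to roughly 0.15 gigaflops per watt if the system drew its full allotment.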
Each team partnered with a vendor who loaned the equipment for the event and, in some cases, provided travel funds to the team. In most cases, the teams had access to the equipment for a few months, but some only had it for a few weeks prior to shipping. On Saturday, Nov. 10, teams arrived to find (or not, in some cases) their equipment waiting in the contest area. Saturday and Sunday were then spent rebuilding systems, optimizing power, and finalizing the benchmarks and applications for the start of the competition.
Teams were asked to run the HPC Challenge benchmarks at the beginning of the event at 8:00 PM on Monday. Once the results were submitted, teams were given access to data sets for the three previously announced applications: GAMESS, POP and POVRay. Teams then spent the remainder of their time -- up to 4:00 PM on Wednesday -- completing as many of the data sets as they could. At the end of the competition, judges interviewed the teams and awarded points based on this interaction. The judges were led by Jack Dongarra (University of Tennessee and ORNL) and included Dona Crawford (LLNL), Satoshi Matsuoka (Tokyo Tech) and Tim Lyons (Morgan Stanley).
For about 44 straight hours, teams worked on the benchmarks and applications in shifts. About halfway through the event, around Tuesday at noon, there was a general power interruption to the section of Reno where the convention center is located. All teams experienced a hard crash and had to scramble to recover. The event couldn't have asked for a more brutal real-life experience. Some of the teams lost upwards of ten hours of compute time and others lost hardware and time associated with debugging the failure. In the end, however, all teams were back online within a couple of hours and some chose to run with automatic checkpoint restart, as available in some of the applications, to protect against further interruption.
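The checkpoint-restart protection some teams turned on after the outage follows a simple pattern: periodically persist the computation's state so a restart resumes from the last saved point rather than from scratch. The sketch below is a minimal, hypothetical illustration of that pattern, not the actual mechanism in GAMESS or POP; the file name, step count, and "work" loop are all invented for the example.

```python
# Minimal application-level checkpoint/restart sketch (illustrative only).
# State is saved every 100 steps; a crashed run resumes from the last save.
import json
import os

CKPT = "state.ckpt"  # hypothetical checkpoint file name

def run(total_steps=1000):
    # Resume from the last checkpoint if one exists, else start fresh.
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "accum": 0.0}

    while state["step"] < total_steps:
        state["accum"] += state["step"] * 0.5  # stand-in for real compute work
        state["step"] += 1
        if state["step"] % 100 == 0:
            # Write to a temp file, then atomically rename, so a crash
            # mid-write never corrupts the existing checkpoint.
            tmp = CKPT + ".tmp"
            with open(tmp, "w") as f:
                json.dump(state, f)
            os.replace(tmp, CKPT)
    return state

if __name__ == "__main__":
    final = run()
    print(final["step"], final["accum"])
```

The write-temp-then-rename step matters: if the power fails during the checkpoint write itself, the previous checkpoint file is still intact, which is exactly the failure mode the Reno outage exercised.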
Ultimately, the winner of the contest was the team from The University of Alberta in Edmonton, Canada. While that team did not have the fastest system on paper, a combination of good preparation and good fortune during the brief power outage gave them the advantage.
The quality of teams, systems and computational work exceeded all expectations. While one could attribute this to individuals rising to the challenge, the committee's opinion is that a larger force is also at work: the entry barrier to supercomputing has dropped significantly. The results show that if you have a need for simulation computing, it is reasonable to believe that you can use local college or university talent and commonly available software and applications to get started on that work.
Another significant outcome of the event is its impact on the curricula of the participating institutions. Half of these schools decided to modify their undergraduate offerings in the future to include cluster and parallel computing classes.
Computational simulation, driven by continued advances in hardware, availability and maturity of cluster OS software, and enabled by parallel application software, has reached a point where it is clearly accessible and available. These tools are now available to industry and we predict the technology will soon be considered critical to enhance competitiveness of businesses of all sizes and in all markets.
In closing, there are numerous people and organizations to credit for the event itself: the ACM and IEEE (sponsors of SC07), the individual team vendor partners (Dell, ASUSTek, Aspen Systems, SGI, Apple and HP), and the event partners (Chevron, WesternGeco and Morgan). The results are exciting and we are already planning for the next event at SC08, Nov. 15-21, in Austin, Texas.