Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them

July 13, 2007

The Week in Review

by John E. West

Here’s a collection of highlights, selected totally subjectively, from this week’s HPC news stream as reported at insideHPC.com and HPCwire.

>>10 words and a link

An IBM first: the AIX6 public beta;
http://insidehpc.com/2007/07/12/an-ibm-first-the-aix6-public-beta/

DataSynapse revs software, supports enterprise grids up to 20k nodes;
http://insidehpc.com/2007/07/10/datasynapse-enabling-commercial-grids-of-up-to-20000-nodes/

DRC announces new Torrenza-enabled FPGA, Cray adopts for adaptive supercomputing;
http://insidehpc.com/2007/07/10/drc-and-cray-working-to-integrate-fpgas-at-hypertransport-speed/

Panasas builds 150TB single rack enterprise storage product;
http://insidehpc.com/2007/07/10/panasas-adds-150-tb-product/

ISC’07 sets new records, over 1200 participants;
http://insidehpc.com/2007/07/10/isc-sets-new-records/

DataDirect adds SATAssure to combat silent data corruption;
http://insidehpc.com/2007/07/10/directdata-networks-fights-silent-sata-corruption/

>>Reed’s teraflops-year code compilation

Dan Reed has a summary on his blog (http://www.renci.org/blog/pivot/entry.php?id=37) of this year’s DOE SOS11 workshop, held in Key West. The workshop theme was “Challenges of Sustained Petascale Computation.”

The post is interesting, and Dr. Reed’s perspective is always valuable. Two comments in particular jumped out at me:

“In the vendor session, Intel discussed its 80-core teraflop test chip and some of the electrical signaling issues it was intended to test. Everyone at the workshop (and at Microsoft Manycore Computing Workshop) agreed that we would see hundred-core commodity chips by the end of the decade. Looking further, one can see thousand-core chips coming.”

And then this comment suggesting that as machines get more complex we may find it advantageous to rely more on machines to help us get performance out of our software:

“What I am really arguing is that we need to rethink aggressive machine optimization, virtualization and abstraction. What’s wrong with devoting a teraflop-year to large-scale code optimization? I don’t just mean peephole optimization or interprocedural analysis. Think about genetic programming, evolutionary algorithms, feedback-directed optimization, multiple objective code optimization, redundancy for fault tolerance and other techniques that assemble functionality from building blocks. Why have we come to believe that compilation times should be measurable with a stopwatch rather than a sundial?”

A sundial! Good stuff.

I think this point of view is worth some serious exploration. We’ve spent 40 years working on compiler technology to get it where it is today. Without a disruptive development (which is like winning the lottery: it happens, but hope isn’t a strategy), it will be a very long time before we can effectively use the new hardware the chipmakers are developing. We need to be seriously exploring a variety of new avenues. This one seems low risk, and would make a good anchor for a more aggressive NSF compiler technology investment portfolio.
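For readers who want a feel for what "search instead of stopwatch" compilation might look like, here is a toy sketch of one technique Reed lists, an evolutionary algorithm, applied to choosing optimization settings. Everything in it is an assumption made for illustration: the flag names are hypothetical (not real compiler options), and `measured_cost` is a synthetic stand-in for the expensive step of actually building and timing a code, which is where the teraflop-year would go.

```python
import random

# Hypothetical on/off optimization choices; these are not real compiler flags.
FLAGS = ["unroll", "vectorize", "inline", "tile", "prefetch", "fuse"]

def measured_cost(config):
    """Synthetic stand-in for 'build with these choices and time the run'."""
    gains = [8.0, 15.0, 5.0, 12.0, 3.0, 6.0]
    cost = 100.0 - sum(g for g, on in zip(gains, config) if on)
    # Made-up interaction: unrolling and tiling together hurt in this toy model.
    if config[0] and config[3]:
        cost += 14.0
    return cost

def mutate(config, rng, rate=0.2):
    """Flip each choice independently with probability `rate`."""
    return tuple(bit ^ (rng.random() < rate) for bit in config)

def evolve(generations=30, population=8, seed=0):
    """Mutation-only evolutionary search that keeps the best half each round."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in FLAGS) for _ in range(population)]
    history = []  # best cost seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=measured_cost)
        history.append(measured_cost(pop[0]))
        survivors = pop[: population // 2]  # kept unchanged (elitism)
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population - len(survivors))]
        pop = survivors + children
    best = min(pop, key=measured_cost)
    return best, measured_cost(best), history

if __name__ == "__main__":
    best, cost, _ = evolve()
    chosen = [name for name, on in zip(FLAGS, best) if on]
    print("chosen settings:", chosen, "synthetic cost:", cost)
```

Because the best half of each generation survives untouched, the best cost never gets worse from one generation to the next. In a real system each call to the cost function would be a full build and benchmark run, which is exactly why compile times would move from the stopwatch toward the sundial.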

>>SDSC’s on demand HPC

SDSC is trying something uncommon with HPC: allocated real-time access to support “event-driven” science. One example of how the system would be used is to support analysis in the immediate aftermath of an earthquake.

“When an earthquake greater than magnitude 3.5 strikes Southern California, typically once or twice a month, [Caltech computational seismologist Jeroen] Tromp expects that his simulation code will need to use 144 processors of the OnDemand system for about 28 minutes. Shortly after the earthquake strikes a job will automatically be submitted and immediately allowed to run. The code will launch and any ‘normal’ jobs running at the time will be interrupted to make way for the on-demand job.”

The 256-core Dell cluster has a peak performance of 2.4 TFLOPS.
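The scheduling behavior described in the quote can be sketched in a few lines. This is only an illustration under stated assumptions, not SDSC's actual scheduler: the `Job` and `Cluster` classes and the job names are hypothetical, and the policy of suspending the most recently started normal jobs first is my own guess, since the announcement says only that normal jobs "will be interrupted." The 256-core and 144-processor figures come from the story.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    cores: int
    on_demand: bool = False

@dataclass
class Cluster:
    total_cores: int
    running: list = field(default_factory=list)
    suspended: list = field(default_factory=list)

    def free_cores(self):
        return self.total_cores - sum(j.cores for j in self.running)

    def submit(self, job):
        """Start `job` if it fits; an on-demand job may interrupt normal jobs."""
        if job.on_demand:
            normal = [j for j in self.running if not j.on_demand]
            reclaimable = self.free_cores() + sum(j.cores for j in normal)
            if reclaimable < job.cores:
                return False  # would not fit even with every normal job interrupted
            # Assumed policy: interrupt the most recently started normal jobs first.
            while self.free_cores() < job.cores:
                victim = normal.pop()
                self.running.remove(victim)
                self.suspended.append(victim)
        if self.free_cores() >= job.cores:
            self.running.append(job)
            return True
        return False

if __name__ == "__main__":
    cluster = Cluster(total_cores=256)
    cluster.submit(Job("batch-a", 128))
    cluster.submit(Job("batch-b", 128))
    # The earthquake strikes: the 144-core simulation runs immediately.
    cluster.submit(Job("quake-sim", 144, on_demand=True))
    print("running:", [j.name for j in cluster.running])
    print("suspended:", [j.name for j in cluster.suspended])
```

A production system would also have to decide whether interrupted jobs are checkpointed, requeued or killed, and how they resume once the on-demand run finishes. The sketch simply parks them in a suspended list.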

The concept of real-time access to HPC isn’t new, but the missions I’m aware of for these systems have been dedicated support (for live fire tests, for example), application debugging, or scientific visualization.

The SDSC application is interesting, and I see demand for this kind of thing increasing over time. I’d like to see FEMA step up to fund regional- and national-scale computational disaster response and planning centers (think hurricanes, fires, floods, plagues and terrorist acts). When not responding interactively to disasters, the systems could be running planning scenarios for major risk areas around the country.

>>Chip advertising costs billions, not working so well?
 
Investor’s Business Daily is running a story about the money the chipmakers spend to get customers familiar with their wares. There are billions of dollars being spent, but a recent survey by research firm In-Stat found that branding in the chip industry isn’t working well.

“One finding: Consumers often know chip brand names such as Centrino and Opteron, but they don’t know that Intel makes Centrino and Advanced Micro Devices makes Opteron.”

And then they carelessly let their freakishly confusing codenames circulate for years in the press, further diluting their brands and so thoroughly confusing everyone it’s a wonder anyone can remember even the most basic facts about chip products.

The good news, though, is that Intel is on top of it.

“‘Is there confusion over brand names? Definitely,’ said Donald MacDonald, Intel vice president for global marketing. ‘This isn’t new. We identified this problem many years ago.’”

Excellent. So long as they’ve identified it, we’re all good to go. The full story is an interesting read (http://biz.yahoo.com/ibd/070710/tech.html?.v=1).

-----

John West summarizes the headlines in HPC every day at insideHPC.com, and writes on leadership and career issues for technology professionals at InfoWorld and on his own blog at onlytraitofaleader.com. You can contact him at john@insidehpc.com.