Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them

August 31, 2007

The Week in Review

by John E. West

Here’s a collection of highlights, selected totally subjectively, from this week’s HPC news stream as reported at insideHPC.com and HPCwire.

>>10 words and a link

In odd move, Sun changes ticker symbol to JAVA;
http://insidehpc.com/2007/08/27/get-your-java-today/

NITRD soliciting comment on federal networking research plan;
http://insidehpc.com/2007/08/27/federal-networking-plan-wants-your-comments/

Researchers build foundation for photo-transistors;
http://insidehpc.com/2007/08/28/researchers-build-foundation-for-photo-transistors/

Startup offers up to 500 GB of network attached cache;
http://insidehpc.com/2007/08/28/gear6-500-gb-of-memory-attached-storage/

Purdue studies undergraduate parallel programming;
http://insidehpc.com/2007/08/28/purdue-to-study-undergraduate-parallel-programming/

Intel readies 3 Penryns sporting 1600MHz FSBs;
http://insidehpc.com/2007/08/30/this-little-penryn-goes-to-market/

>>AMD announces ISA extensions for HPC

AMD has announced that it is adding new instruction set extensions designed to improve performance in HPC, multimedia, and security applications.

The extensions, called SSE5, build on the Streaming SIMD Extensions originally introduced in 1999. Although AMD is making the specification available starting today to foster a dialogue with developers, the extensions won’t appear in a product until AMD’s Bulldozer core is available in 2009. (Really? Bulldozer?)

The Register dug into the spec a little (http://www.theregister.co.uk/2007/08/30/amd_sse5/):

For one, AMD will follow the RISC crowd with support for 3-Operand Instructions — up from two. So, unlike in the past where you would do A plus B and then have to store the result of the operation in A or B, developers can now store the result in a third location. This should reduce the total number of instructions needed to perform certain tasks and require less effort on the part of developers to keep track of registers.

The support for 3-Operand Instructions allows AMD to roll out a “fused multiply accumulate” instruction as well. This melds multiplication and addition to permit “iterative calculations with one instruction.”

Read the spec for yourself at http://developer.amd.com/sse5.jsp.
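
To make the instruction-count argument in The Register's excerpt concrete, here is a rough sketch in Python of a toy register machine. Nothing in it comes from the SSE5 spec: the mnemonics (mov, mul, add, fmacc), the register names, and the tuple encoding are all invented for illustration. It also assumes "fused" carries its usual meaning, that the product is not rounded before the add; the spec is the authority on what SSE5 actually does.

```python
from fractions import Fraction

def run(program, regs):
    """Execute toy instructions (op, dst, *srcs) on a copy of the register dict."""
    regs = dict(regs)
    for op, dst, *srcs in program:
        vals = [regs[s] for s in srcs]
        if op == "mov":
            regs[dst] = vals[0]
        elif op == "mul":
            regs[dst] = vals[0] * vals[1]
        elif op == "add":
            regs[dst] = vals[0] + vals[1]
        elif op == "fmacc":
            # Fused (assumed semantics): form a*b + acc exactly, round once.
            exact = Fraction(vals[0]) * Fraction(vals[1]) + Fraction(regs[dst])
            regs[dst] = float(exact)
        else:
            raise ValueError(f"unknown op: {op}")
    return regs

# Task for all three programs: acc += a * b, leaving a and b intact.
TWO_OPERAND = [
    ("mov", "t", "a"),           # copy a, since the multiply overwrites an input
    ("mul", "t", "t", "b"),      # t = t * b (destination is also a source)
    ("add", "acc", "acc", "t"),  # acc = acc + t
]
THREE_OPERAND = [
    ("mul", "t", "a", "b"),      # t = a * b, result goes to a third location
    ("add", "acc", "acc", "t"),  # acc = acc + t
]
FUSED = [
    ("fmacc", "acc", "a", "b"),  # acc = a * b + acc in one instruction
]

def accumulate(program, a, b, acc):
    """Run one of the toy programs and return the final accumulator value."""
    return run(program, {"a": a, "b": b, "acc": acc})["acc"]

for name, prog in [("two-operand", TWO_OPERAND),
                   ("three-operand", THREE_OPERAND),
                   ("fused", FUSED)]:
    print(name, len(prog), "instructions ->", accumulate(prog, 3.0, 4.0, 5.0))
```

All three programs produce the same accumulator for ordinary inputs, using three, two, and one instruction respectively. With inputs chosen so that the product rounds (for example a = 1 + 2**-30, b = 1 - 2**-30, acc = -1.0), the fused form keeps a small nonzero result that the separate multiply and add lose.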

>>Coming soon to Omaha: 9,200 core dual boot cluster

The University of Nebraska at Omaha is building a new supercomputer. According to ClusterMonkey (http://www.clustermonkey.net//content/view/209/2/), the machine will have 1,151 Dell servers with dual-socket Barcelonas, for a total of over 9,200 cores. The system is slated to be dual boot, offering both Linux and Windows CCS. More at http://insidehpc.com/2007/08/29/coming-soon-to-omaha-9200-core-dual-boot-cluster/.
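
As a quick sanity check on the core count, the arithmetic can be sketched in a few lines of Python. The server and socket counts come from the item above; the four cores per socket is an assumption based on Barcelona being a quad-core part, which the item does not state. The function name is made up for this example.

```python
def total_cores(servers, sockets_per_server, cores_per_socket):
    """Multiply out the core count for a homogeneous cluster."""
    return servers * sockets_per_server * cores_per_socket

# 1,151 servers and two sockets each are from the report;
# 4 cores per socket is an assumption (quad-core Barcelona).
omaha = total_cores(servers=1151, sockets_per_server=2, cores_per_socket=4)
print(omaha)  # 9208
```

That lands at 9,208 cores, which squares with the "over 9,200" in the report.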

>>Analyst digs in on Intel’s HyperTransport copy

You’ve no doubt heard of Intel’s Common System Interface by now. This is Intel’s shot at moving its own architectures away from the front side bus in the way that AMD did with HyperTransport.

The Register reports (http://www.theregister.co.uk/2007/08/28/intel_csi_kanter/) on the work of analyst David Kanter at Real World Technologies, who has dug through patents and interviewed engineers to put together a detailed report on what he thinks CSI will look like. We won’t get to hear Intel’s plans for its technology until the IDF next month.

An excerpt from Kanter’s report:

Unlike the front-side bus, CSI is a cleanly defined, layered network fabric used to communicate between various agents. These ‘agents’ may be microprocessors, coprocessors, FPGAs, chipsets, or generally any device with a CSI port…. Initial CSI implementations in Intel’s 65nm and 45nm high performance CMOS processes target 4.8-6.4GT/s operation, thus providing 12-16GB/s of bandwidth in each direction and 24-32GB/s for each link.

You can read Kanter’s report at http://realworldtech.com/page.cfm?ArticleID=RWT082807020032&p=1.
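
The bandwidth range in the excerpt can be reproduced with a short Python sketch. The 20 bits per transfer in each direction is an assumption: the excerpt does not give a link width, but that is the width that makes its numbers work out. The function name and parameters are invented for this example.

```python
def link_bandwidth_gb_per_s(transfers_gt_per_s, bits_per_transfer=20):
    """Return (per-direction, both-directions) bandwidth in GB/s.

    bits_per_transfer=20 is an assumed link width, not a figure
    taken from the excerpt.
    """
    per_direction = transfers_gt_per_s * bits_per_transfer / 8  # bits -> bytes
    return round(per_direction, 1), round(2 * per_direction, 1)

for rate in (4.8, 6.4):
    one_way, both_ways = link_bandwidth_gb_per_s(rate)
    print(f"{rate} GT/s: {one_way} GB/s each direction, {both_ways} GB/s per link")
```

At 4.8 GT/s this gives 12 GB/s per direction and 24 GB/s per link, and at 6.4 GT/s it gives 16 and 32, matching the ranges Kanter quotes.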

>>IBM introduces new cell blade

IBM announced this week the release of the QS21 cell blade. The QS21 replaces the QS20 and includes two cell processors, more memory, dual GbE, and dual-port QDR IB. According to the company:

The IBM BladeCenter QS21 is one of the most power efficient computing platforms to date, generating a measured 1.05 Giga Floating Point Operations Per Second (GigaFLOPS) per watt. With its peak performance of approximately 460 GFLOPS, clients can achieve 6.4 Tera Floating Point Operations Per Second (TeraFLOPS) in a single BladeCenter chassis and over 25.8 TeraFLOPS in a standard 42U rack.

The line of cell blades is an attempt by IBM to move the benefits of HPC into the enterprise by providing exotic technology in a familiar package. (Links to more detailed coverage at http://insidehpc.com/2007/08/30/ibm-intro-news-cell-blade/.)
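
IBM's chassis and rack figures can be roughly checked from the per-blade peak with a little Python. The 460 GFLOPS per blade is from the quote; the 14 blades per chassis and 4 chassis per 42U rack are assumptions about the BladeCenter packaging that the quote does not spell out, and the names below are made up for this sketch.

```python
BLADE_PEAK_GFLOPS = 460   # "approximately 460 GFLOPS" from IBM's statement
BLADES_PER_CHASSIS = 14   # assumption: not stated in the quote
CHASSIS_PER_RACK = 4      # assumption: not stated in the quote

def aggregate_tflops(blades, gflops_per_blade=BLADE_PEAK_GFLOPS):
    """Aggregate peak for a number of blades, in TFLOPS."""
    return blades * gflops_per_blade / 1000

chassis_tflops = aggregate_tflops(BLADES_PER_CHASSIS)
rack_tflops = aggregate_tflops(BLADES_PER_CHASSIS * CHASSIS_PER_RACK)
print(f"chassis: {chassis_tflops:.2f} TFLOPS, rack: {rack_tflops:.2f} TFLOPS")
```

Under those assumptions a chassis comes to 6.44 TFLOPS and a rack to 25.76 TFLOPS. The rack figure sits just under IBM's "over 25.8", which is plausible if the rounded 460 GFLOPS slightly understates the per-blade peak.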

-----

John West summarizes the headlines in HPC news every day at insideHPC.com. You can contact him at john@insidehpc.com. Too busy to keep up? Make your commute productive and subscribe to the Weekly Takeout, insideHPC.com’s weekly podcast summary of the HPC news week in review.