The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
July 02, 2008
As Intel continues to flesh out its multicore processor roadmap, Anwar Ghuloum, principal engineer with the company's Microprocessor Technology Lab, is already encouraging software developers to begin designing applications for manycore processors -- architectures that contain tens, hundreds or thousands of cores. In his latest blog entry, titled "Unwelcome Advice," he makes the case that designing apps for multicore processors is something of a dead end.
If you're a mainstream programmer, you might find this puzzling. The current crop of IA processors sports 2 to 4 cores, and Intel's next-generation Nehalem processors, due out later this year, will offer between 2 and 8 cores. Larrabee, Intel's first manycore processor family, is a couple of years down the road, and its programming guide is still under wraps. So why commit to manycore when you're struggling to slice up your application into enough threads to give the current cores something to do?
Ghuloum's point is that manycore architecture will emerge as the enduring model in the industry, implying that the transition from "multi" to "many" will happen rather quickly. "Eventually, developers realize that the end point is on the other side of a mountain of silicon innovations," he writes. The downside, Ghuloum admits, is that designing an application for manycore is a much bigger undertaking than doing a multicore port.
The latter usually involves just breaking an application into well-defined tasks that can run concurrently. In some cases, this design may naturally be teased out of the existing code framework, where the application's major functions can be logically mapped to independent threads. A trivial example is a transactional app that inputs a data item, processes it, and outputs the transformed result -- and does this over and over. Turning this from a sequential app into one with three threads (Input, Process, Output) that carry out the tasks in parallel is often fairly straightforward. But since the application is now mapped to three cores, taking advantage of a platform with a different core count involves another dive into the application. Not only that, but for most applications there is no clear functional decomposition that can get you to tens or hundreds of cores.
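The three-thread decomposition described above can be sketched roughly as follows. This is a minimal illustration in Python, not code from Ghuloum's post; the names `run_pipeline`, `input_stage`, `process_stage` and `output_stage` are hypothetical, and the queues and sentinel are just one common way to connect the stages. Note that in CPython the global interpreter lock limits how much CPU-bound work threads can overlap, so the sketch shows the structure rather than a speedup.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker passed between stages


def input_stage(items, out_q):
    """Input thread: feed each data item into the pipeline."""
    for item in items:
        out_q.put(item)
    out_q.put(SENTINEL)


def process_stage(in_q, out_q, transform):
    """Process thread: transform each item and pass it on."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            out_q.put(SENTINEL)  # forward the end-of-stream marker
            break
        out_q.put(transform(item))


def output_stage(in_q, results):
    """Output thread: collect transformed items in arrival order."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        results.append(item)


def run_pipeline(items, transform):
    """Run the Input, Process and Output stages as three concurrent threads."""
    raw_q, done_q = queue.Queue(), queue.Queue()
    results = []
    threads = [
        threading.Thread(target=input_stage, args=(items, raw_q)),
        threading.Thread(target=process_stage, args=(raw_q, done_q, transform)),
        threading.Thread(target=output_stage, args=(done_q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_pipeline(records, transform)` keeps exactly three threads busy no matter how many cores the machine has, which is the limitation the paragraph points out: the thread count is baked into the design.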
Long-term scalability is easier to achieve with a manycore design mindset up front, since you're forced to deal with the built-in assumption that the ultimate number of cores is not just large, but variable. Ghuloum admits this "usually requires at least some degree of going back to the algorithmic drawing board and rethinking some of the core methods they implement." He continues (and here's the really Unwelcome Advice part): "This also presents the 'opportunity' for a major refactoring of their code base, including changes in languages, libraries, and engineering methodologies and conventions they’ve adhered to for (often) most of [their] software’s existence."
Ghuloum glosses over how this might actually be accomplished, but in previous blog entries he has written more extensively about Intel's adventures in fine-grained parallelism and terascale computing in general. In particular, back in January, he wrote about the company's new Ct language, a C/C++ derivative, and offered some detail about how it supports manycore architectures. In a nutshell, Ct supports both data and task parallelism and is designed to tackle fine-grained parallelism in a more generalized way than is currently being done with GPGPU environments like NVIDIA's CUDA or AMD's Brook+. The Ct runtime insulates the programmer from the underlying hardware, creating and dispatching threads as needed.
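To illustrate what treating the core count as a variable might look like, here is a minimal data-parallel sketch in Python. It is not Ct and not Intel's approach; the function names `split` and `sum_of_squares` are hypothetical, and a thread pool stands in for whatever runtime would actually dispatch the work. In CPython a process pool would be needed for real CPU parallelism, so treat this as a sketch of the structure only.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Split data into at most `parts` contiguous chunks of near-equal size."""
    parts = max(1, min(parts, len(data)))
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one leftover element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(data, workers=None):
    """Compute sum(x*x) with the worker count decided at run time."""
    workers = workers or os.cpu_count() or 1
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is reduced independently; partial sums are then combined.
        partials = pool.map(lambda chunk: sum(x * x for x in chunk), chunks)
        return sum(partials)
```

The point of the design is that nothing in `sum_of_squares` assumes a particular number of cores: the same code runs with 1, 8 or 64 workers and returns the same answer, because the work is divided by data rather than by function.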
Whether anyone will heed Ghuloum's Unwelcome Advice remains to be seen. He correctly notes that the HPC crowd is already on board with the idea of automatic application scaling, but the rest of the industry may be a harder sell. It's hard to imagine that many (non-HPC) legacy codes will migrate to manycore; a lot just don't contain enough inherent parallelism to make the trip worthwhile.
The real potential for manycore architectures and software environments like Ct is that they will provide a platform for much more intelligent (and thus valuable) types of software than exist today -- applications like image-based data miners, personal health agents, and real-time market trade analyzers. (I wrote about how Intel views this application space last year.) If developers see clear monetary opportunities in manycore applications, all advice will be welcomed.
Posted by Michael Feldman - July 2 @ 6:50PM
Michael Feldman is the editor of HPCwire.