HPCwire


Blog: From the Editor


Core Convictions


As Intel continues to flesh out its multicore processor roadmap, Anwar Ghuloum, principal engineer with the company's Microprocessor Technology Lab, is already encouraging software developers to begin designing applications for manycore processors -- architectures that contain tens, hundreds or thousands of cores. In his latest blog entry, titled "Unwelcome Advice," he makes the case that designing apps for multicore processors is something of a dead end.

If you're a mainstream programmer, you might think this puzzling. The current crop of IA processors sports 2 to 4 cores and Intel's next generation Nehalem processors, due out later this year, will support between 2 and 8. Larrabee, Intel's first manycore processor family, is a couple of years down the road, and its programming guide is still under wraps. So why commit to manycore when you're struggling to slice up your application into enough threads to give the current cores something to do?

Ghuloum's point is that manycore architecture will emerge as the enduring model in the industry, implying that the transition from "multi" to "many" will happen rather quickly. "Eventually, developers realize that the end point is on the other side of a mountain of silicon innovations," he writes. The downside, Ghuloum admits, is that designing an application for manycore is a much bigger undertaking than doing a multicore port.

The latter usually involves just breaking an application into well-defined tasks that can run concurrently. In some cases, this design may naturally be teased out of the existing code framework, where the application's major functions can be logically mapped to independent threads. A trivial example is a transactional app that inputs a data item, processes it, and outputs the transformed result -- and does this over and over. Turning this from a sequential app into one with three threads (Input, Process, Output) that carry out the tasks in parallel is often fairly straightforward. But since the application is now mapped to three cores, taking advantage of a platform with a different core count involves another dive into the application. Not only that, but for most applications there is no clear functional decomposition that can get you to tens or hundreds of cores.
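To make the Input/Process/Output decomposition concrete, here is a minimal sketch in Python. It is not taken from Ghuloum's post; the names (`run_pipeline`, `input_stage`, and so on) are hypothetical, and the stages are connected by standard-library queues. In CPython the global interpreter lock keeps CPU-bound threads from truly running in parallel, so read this as an illustration of the structure rather than a performance recipe.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker passed between stages


def input_stage(items, out_q):
    """Feed each item into the pipeline, then signal end of stream."""
    for item in items:
        out_q.put(item)
    out_q.put(SENTINEL)


def process_stage(in_q, out_q, transform):
    """Apply the transform to each item and forward the result."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            out_q.put(SENTINEL)  # propagate shutdown to the output stage
            break
        out_q.put(transform(item))


def output_stage(in_q, results):
    """Collect transformed items in arrival order."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        results.append(item)


def run_pipeline(items, transform):
    """Run the three stages on exactly three threads."""
    q1, q2 = queue.Queue(), queue.Queue()
    results = []
    threads = [
        threading.Thread(target=input_stage, args=(items, q1)),
        threading.Thread(target=process_stage, args=(q1, q2, transform)),
        threading.Thread(target=output_stage, args=(q2, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Note that the thread count is baked into the design: there are three stages, so there are three threads, whether the machine has 2 cores or 200. That is the limitation the paragraph above describes.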

Long-term scalability is easier to achieve with a manycore design mindset up front, since you're forced to deal with the built-in assumption that the ultimate number of cores is not just large, but variable. Ghuloum admits this "usually requires at least some degree of going back to the algorithmic drawing board and rethinking some of the core methods they implement." He continues (and here's the really Unwelcome Advice part): "This also presents the 'opportunity' for a major refactoring of their code base, including changes in languages, libraries, and engineering methodologies and conventions they’ve adhered to for (often) most of their software’s existence."
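One way to picture the "large but variable" assumption is a data-parallel design in which the worker count is a runtime parameter rather than a property of the code. The sketch below is an illustration under that assumption, not Ghuloum's method and not Ct; `split_into_chunks` and `parallel_map` are hypothetical names, and a thread pool stands in for whatever runtime would actually schedule the work.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, n_chunks):
    """Split data into at most n_chunks contiguous, near-equal slices."""
    n_chunks = max(1, min(n_chunks, len(data)))
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_map(transform, data, workers=None):
    """Apply transform to every element, spreading chunks over `workers`.

    The worker count defaults to whatever the machine reports, so the
    same code runs unchanged on 2 cores or on many more.
    """
    if not data:
        return []
    workers = workers or os.cpu_count() or 1
    chunks = split_into_chunks(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, so results line up with the input
        mapped = list(pool.map(lambda chunk: [transform(x) for x in chunk], chunks))
    return [y for chunk in mapped for y in chunk]
```

The decomposition here follows the data rather than the program's functions, which is why it can stretch to any worker count. It only works when the elements can be processed independently, and that is the kind of algorithmic rethinking the quote refers to.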

Ghuloum glosses over how this might actually be accomplished, but in previous blog entries he has written more extensively about Intel's adventures in fine-grained parallelism and terascale computing in general. In particular, back in January, he wrote about the company's new Ct language, a C/C++ derivative, and offered some detail about how it supports manycore architectures. In a nutshell, Ct supports both data and task parallelism and is designed to tackle fine-grained parallelism in a more generalized way than is currently being done with GPGPU environments like NVIDIA's CUDA or AMD's Brook+. The Ct runtime insulates the programmer from the underlying hardware, creating and dispatching threads as needed.

Whether anyone will heed Ghuloum's Unwelcome Advice remains to be seen. He correctly notes that the HPC crowd is already on board with the idea of automatic application scaling, but the rest of the industry may be a harder sell. It's hard to imagine that many (non-HPC) legacy codes will migrate to manycore; a lot just don't contain enough inherent parallelism to make the trip worthwhile.

The real potential for manycore architectures and software environments like Ct is that they will provide a platform for much more intelligent (and thus valuable) types of software than exist today -- applications like image-based data miners, personal health agents, and real-time market trade analyzers. (I wrote about how Intel views this application space last year.) If developers see clear monetary opportunities in manycore applications, all advice will be welcomed.

Posted by Michael Feldman - July 2 @ 6:50PM



Michael Feldman

Michael Feldman is the editor of HPCwire.
