September 01, 2006
If there is anything that will slow down the multi-core juggernaut, it is the lack of software that can take advantage of multiple cores. While commodity multi-core chips are well-known fixtures in servers and high performance computers, the highest volume markets, represented by the desktop and laptop segments, are just now getting used to the idea of dual-core processors. Within a relatively short period of time, multi-threaded software has become everyone's problem.
Intel has a huge stake in making sure future software follows the multi-core model. By the end of next year, the majority of processors shipped by the company will be either dual-core or quad-core. Commodity processors with eight cores and above are already being conceived. Intel can't afford to have all those extra CPUs just spinning with nothing to do. So part of the company's mission has become to convince software developers to "think parallel."
The HPC community has recognized the problem for years. In 1999 during an OpenMP presentation at the Supercomputer Conference, one of the introductory slides stated: "The benefits are clear. To increase the amount of parallel software, we need to reduce the perceived difficulties." Seven years later, the benefits of parallel programming and the perceived difficulties have been extended to the entire IT community.
But Intel is not going to rely on the HPC crowd to drive the paradigm shift of parallel computing into the larger IT community. Supercomputing, while important to Intel, is not perceived as a software technology driver by the company. Back in April, when I spoke with Justin Rattner, Intel's chief technology officer, he had this to say:
"A few years ago, when Intel asked me to look at what we should be doing in HPC, I was struck by how little progress had been made on the programming front. That's my big disappointment. The technologies that were popular a decade or more ago are still in widespread use today. We're still programming in MPI and still working on technologies like OpenMP. I had hoped and expected that after a decade or more we really would have made some fundamental advancements on the software side.
"I think that HPC probably won't drive the fundamental advancements in parallel programming. I think it had that opportunity, but that window of leadership is rapidly closing. The advent of multi-core processors in the high volume spaces is probably going to do more. It's certainly going to attract a lot more investment in creating powerful solutions to the programming problem -- largely out of necessity. If these new architectures are going to be successful, a lot of people are going to have to program them and they're not going to be satisfied with the kinds of tools available in HPC today."
Part of the effort to get developers used to the idea of parallel programming involves education. Earlier this month, Intel announced a worldwide effort to prepare university students for the new multi-core paradigm. Intel will provide expertise, funding, development tools, educational materials, on-site training and collaboration to 45 top universities to incorporate multi-core and multi-threading concepts into their computer science curricula. Related to this, the company has even developed its own software college (information online at http://or1cedar.cps.intel.com/softwarecollege/), a kind of global extension program for programmers.
Intel is also taking a more direct approach to encouraging parallel programming by delivering multi-threading developer tools, such as OpenMP-capable compilers, VTune, Thread Checker and Thread Profiler. This past Monday, Intel introduced a new software package, Threading Building Blocks (TBB), a threading library for C++ developers. I talked to James Reinders, Intel marketing director for the company's Developer Products Division, about the new software and wrote about it in this week's issue.
From what Reinders told me, Threading Building Blocks is targeted at the same types of systems as OpenMP, namely shared memory SMP systems. TBB's main advantage over OpenMP is that it doesn't require a special compiler, since it relies on standard C++ templates, rather than special language pragmas, to implement its parallel constructs. TBB also provides abstractions for task parallelism (being considered for OpenMP 3.0) and critical regions.
And, since the template library provides a more straightforward way to extend language features, it offers programmers more flexibility. The OpenMP pragma-driven model is certainly powerful, but compiler directives are a questionable way to make significant extensions to language functionality. On the other hand, OpenMP has other things going for it: It is an open standard, is portable across multiple languages (C, C++ and Fortran) and already enjoys some market penetration. The OpenMP language committee is in the process of designing version 3.0 of the specification. Read more about some of the issues being discussed in this week's feature article, "The Future of User-Directed SMP Parallel Programming."
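To make the pragma-versus-template distinction concrete, here is a minimal sketch of the same loop written both ways. The `parallel_for_sketch` helper and the two `scale_with_*` functions are hypothetical names invented for this illustration; the helper is built on standard C++ threads and is not TBB's actual interface. It only shows how a parallel construct can live in an ordinary template library, whereas the pragma version needs an OpenMP-capable compiler to run in parallel (a compiler without OpenMP support simply ignores the directive and runs the loop serially).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper, NOT the TBB API: splits [begin, end) into contiguous
// chunks and calls body(i) for every index, one chunk per std::thread.
template <typename Body>
void parallel_for_sketch(std::size_t begin, std::size_t end, Body body,
                         unsigned workers = 4) {
    assert(begin <= end);
    if (workers == 0) workers = 1;
    const std::size_t n = end - begin;
    const std::size_t chunk = (n + workers - 1) / workers;
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t lo = begin + std::min(n, w * chunk);
        const std::size_t hi = begin + std::min(n, (w + 1) * chunk);
        if (lo == hi) continue;  // nothing left for this worker
        pool.emplace_back([lo, hi, &body] {
            for (std::size_t i = lo; i < hi; ++i) body(i);
        });
    }
    for (auto& t : pool) t.join();  // wait for every chunk to finish
}

// Pragma style: the parallelism is requested through a compiler directive.
std::vector<double> scale_with_pragma(std::vector<double> v, double k) {
    const long n = static_cast<long>(v.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) v[i] *= k;
    return v;
}

// Template style: the parallelism comes from an ordinary library call,
// so any standard C++ compiler can build it.
std::vector<double> scale_with_template(std::vector<double> v, double k) {
    parallel_for_sketch(0, v.size(), [&v, k](std::size_t i) { v[i] *= k; });
    return v;
}
```

Both functions return the same scaled vector. The difference is where the parallel machinery lives: in the compiler for the pragma version, and in library code that can be extended like any other C++ for the template version.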
Intel already has an investment in OpenMP in its own compilers and has even produced a Cluster OpenMP version for distributed memory systems. The company designed the TBB library to coexist with OpenMP, as well as native threading code, within the same application. So rather than competing with OpenMP, Reinders characterized the TBB product as filling in some of the gaps.
What Intel would really like to accomplish is to wean developers away from native Windows and POSIX threading, a low-level approach to parallel programming that the company views as counter-productive. There's no reason to have thousands of developers devise their own thread management schemes. Not only is re-inventing the wheel time-consuming, it's also error-prone. Reinders makes a good case that the sort of high-level threading model encapsulated in the TBB template library is the way to go. Says Reinders:
"We're really able to do some incredibly sophisticated things under the hood. And if you really want to get a scalable threaded application, you need to do these things. But I would not want to try to educate everyone how to write these; or even if I educated them, I wouldn't want to suggest that everyone should spend their time writing wonderful core threading capabilities like task queuing and stack management. Having them written for you and be part of the language is exactly the right thing to do."
Lest you think he is just pushing the company line, here's what Reinders had to say in "Programming for Concurrency: New Tools Arrive" on the Dr. Dobb's Journal site:
"If you don't like what Intel has to offer, please find something else. But try to avoid writing the native threads. You'll waste time and you won't end up with a scalable application."
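For a sense of what "writing the native threads" involves, here is a rough sketch of summing an array with raw POSIX threads. The `SumTask` struct and the `sum_worker` and `sum_with_native_threads` functions are made-up names for this illustration, and the fixed four-way split is an arbitrary assumption rather than a recommended scheme. Only `pthread_create` and `pthread_join` are real POSIX calls.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <vector>

// Per-thread bookkeeping that the developer has to define and fill in by hand.
struct SumTask {
    const int* data;
    std::size_t begin;
    std::size_t end;
    long long partial;
};

// Thread entry point in the void* style that POSIX threads require.
extern "C" void* sum_worker(void* arg) {
    SumTask* task = static_cast<SumTask*>(arg);
    long long s = 0;
    for (std::size_t i = task->begin; i < task->end; ++i) s += task->data[i];
    task->partial = s;
    return nullptr;
}

long long sum_with_native_threads(const std::vector<int>& v,
                                  unsigned workers = 4) {
    if (workers == 0) workers = 1;
    const std::size_t n = v.size();
    const std::size_t chunk = (n + workers - 1) / workers;
    std::vector<SumTask> tasks(workers);     // sized up front so pointers stay valid
    std::vector<pthread_t> ids(workers);
    std::vector<bool> started(workers, false);

    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t lo = std::min(n, w * chunk);
        const std::size_t hi = std::min(n, (w + 1) * chunk);
        tasks[w] = SumTask{v.data(), lo, hi, 0};
        if (pthread_create(&ids[w], nullptr, sum_worker, &tasks[w]) == 0) {
            started[w] = true;
        } else {
            sum_worker(&tasks[w]);  // thread creation failed: do the chunk inline
        }
    }

    long long total = 0;
    for (unsigned w = 0; w < workers; ++w) {
        if (started[w]) {
            const int rc = pthread_join(ids[w], nullptr);
            assert(rc == 0);
            (void)rc;
        }
        total += tasks[w].partial;  // combine the per-thread results
    }
    return total;
}
```

Nearly all of this is partitioning, argument marshalling, and failure handling rather than the one-line summation itself, and none of it adapts to the number of cores or to uneven workloads. That is the kind of plumbing a higher-level library is meant to take off the developer's hands.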
As always, comments about HPCwire are welcomed and encouraged. Write to me, Michael Feldman, at firstname.lastname@example.org.
Posted by Michael Feldman - August 31, 2006 @ 9:00 PM, Pacific Daylight Time
Michael Feldman is the editor of HPCwire.