The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
February 23, 2007
The evolution of 4th generation surgery tools will help spread brain surgery to the masses, altogether dispensing with neurosurgeons in small hospitals that cannot afford their high pay.
Do you feel that I am pulling your leg? I am. But so is the HPCwire editor when he claims that 4th generation programming languages will make HPC programming available to the masses. Programming -- at least, programming of a large, complex code -- is a specialized task that requires a specialist -- a software engineer -- to the same extent that brain surgery requires a specialist. You might claim that many non-specialists do write code. It is also true that most of us take care of our routine health problems. But only a fool would try brain surgery because he was successful in removing a corn from his foot. To believe that better languages will soon make software engineers redundant is to believe that Artificial General Intelligence will progress much faster than most of us expect, or to belittle the specialized skills of software engineers.
The editor expects that the masses will program clusters on their own, while software engineers will continue to be needed for programming leading-edge supercomputers. However, the difference is not between bleeding-edge supercomputers and clusters. It is between complex programming tasks and simple programming tasks. Writing a simple program with little concern for performance and not too much worry about correctness is tantamount to removing a corn. Even writing a simple program that must achieve close to optimal performance -- for example, an FFT routine -- is tantamount to brain surgery, even if the target system is the processor that operates my laptop. Writing a moderately complex program that is bug free with high confidence and can be used to control a critical system is also tantamount to brain surgery. Finally, writing a large, complex program that more or less satisfies specifications seems to be harder than brain surgery (large software projects seem to have a higher mortality rate than brain surgery patients).
Programming is harder when the program is more complex and when constraints of high efficiency or high confidence are stricter. Performance constraints can appear on large systems and on small ones: it can be extremely hard to shoehorn a compute-intensive application into the power and memory constraints of a cell phone. Performance can also matter a lot for cluster programs that are frequently used. The programmers of the MPI or ScaLAPACK libraries have good reasons to carefully tune the performance of their libraries on clusters: these libraries consume many cycles on many clusters, and improving their performance will improve the performance of many applications. While the difficulty of performance tuning relates to the complexity of the target architecture, one can well argue that a cluster is a more complex architecture than a leading-edge supercomputer, because of the more complex software environment and the less controllable behavior of commercial LAN switches.
There is no obvious reason for cluster programs to be smaller or for confidence requirements on clusters to be less stringent than for supercomputers. However, it is true that supercomputer computations are more likely to be resource constrained than cluster computations. Indeed, a program will be run on a capability platform only if it cannot execute in a reasonable time on a smaller cluster. Such programs may tax even the resources of a leading-edge supercomputer. On the other hand, performance may be less critical for cluster programs that do not consume significant hardware resources. I am not sure that such programs account for a large fraction of cluster cycles.
The editor draws a dichotomy between MPI with C or Fortran for the high priests of supercomputing and MATLAB or SQL for the masses. This dichotomy is false. The highest-performing commercial transaction systems use SQL, but SQL by itself does not make a commercial application. Such an application will use a variety of services and frameworks, and will be written in a variety of programming languages. SQL engines themselves are written, by experts, in C or a similar language.
The same holds true for scientific and engineering computations, be it on clusters or on supercomputers: whenever possible, users will use available libraries or frameworks. The libraries will be implemented in Fortran, C or similar languages, and users will use these languages for their glue code. Libraries have been used for many decades to extend the expressiveness of low level programming languages such as Fortran or C.
Computational frameworks are increasingly replacing low level programming languages as the main mechanism for expressing computations in many domains. Such frameworks can be specialized by plugging in specific methods, often written in lower level languages, and can be extended in a variety of ways. I am not sure what the difference between a well-designed computational framework (such as Cactus) and a "fourth generation language" or "domain specific language" is. Such computational frameworks are domain specific. They emphasize higher levels of abstraction, and the execution model is often interactive. Furthermore, computational frameworks are increasingly used for codes that run on the largest supercomputers.
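To make the idea of "plugging in specific methods" concrete, here is a minimal sketch in Python. It is not the Cactus API or that of any real framework: the names `run_simulation` and `diffusion_kernel` are hypothetical, chosen only for illustration. The framework owns the control flow (the time loop), while the domain expert supplies only the update rule.

```python
from typing import Callable, List

# Hypothetical mini-framework (not a real library): the driver owns the
# time loop, and the user plugs in the update rule as a function.
State = List[float]
Kernel = Callable[[State, float], State]


def run_simulation(initial: State, kernel: Kernel, dt: float, steps: int) -> State:
    """Framework side: advance the state `steps` times using the plugged-in kernel."""
    state = list(initial)
    for _ in range(steps):
        state = kernel(state, dt)
    return state


def diffusion_kernel(state: State, dt: float) -> State:
    """User side: one explicit 1D diffusion step; the two end points stay fixed."""
    new = list(state)
    for i in range(1, len(state) - 1):
        new[i] = state[i] + dt * (state[i - 1] - 2.0 * state[i] + state[i + 1])
    return new


# The domain expert writes only the kernel and this one call.
result = run_simulation([0.0, 0.0, 1.0, 0.0, 0.0], diffusion_kernel, dt=0.25, steps=1)
print(result)
```

In a production framework the kernel would typically be written in a lower level language and the driver would handle parallelism and I/O, but the division of labor is the same.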
Programming on supercomputers, like programming on any other platform, is likely to evolve toward higher level, more powerful programming languages or frameworks. Languages or frameworks that are more extensible, have more powerful type systems with better type inference, and provide support for generic programming are safer and increase productivity. Such high-level languages are likely to have specialized idioms for specific application domains. To a large extent this is already true for languages such as Java or C#, since much programming is done using powerful domain specific classes. For example, programming a GUI in Java using Swing is very different from programming a business application using Enterprise JavaBeans, and programmers specialize in one or the other. To the same extent, languages such as C# or Java, or next generation languages, can be extended with idioms for scientific computing. This has been done for Java (www.javagrande.org).
The evolution of programming language and compiler technology provides more powerful mechanisms for language extension. The extension mechanisms go beyond predefined and pre-coded methods: code generation can occur at run-time or, indeed, whenever new relevant information on characteristics of the computation becomes available. The user can control at various levels the implementation mechanisms for the high-level objects and their methods and even the implementation mechanisms for control structures. The Telescoping Languages project of the late Ken Kennedy and the Fortress language project at Sun are showing the strength of such techniques.
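As a toy illustration of code generation at run-time, the following Python sketch specializes a polynomial evaluator once its coefficients become known. It makes no claim about how the projects named above work; the function names are hypothetical, and the technique shown is simply building source text and compiling it with Python's built-in `exec`.

```python
from typing import Callable, Sequence


def generic_poly(coeffs: Sequence[float], x: float) -> float:
    """Generic version: loops over the coefficients on every call (Horner's rule)."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def specialize_poly(coeffs: Sequence[float]) -> Callable[[float], float]:
    """Generate and compile an unrolled evaluator once the coefficients are known."""
    expr = "0.0"
    for c in coeffs:
        # Each step wraps the previous expression: (prev * x + c).
        expr = f"(({expr}) * x + {c!r})"
    source = f"def poly(x):\n    return {expr}\n"
    namespace: dict = {}
    exec(source, namespace)  # the code generation step happens here, at run time
    return namespace["poly"]


coeffs = [2.0, -3.0, 1.0]  # 2x^2 - 3x + 1, highest degree first
fast_poly = specialize_poly(coeffs)
print(fast_poly(2.0), generic_poly(coeffs, 2.0))
```

The generated function performs the same arithmetic in the same order as the generic loop, so both return identical results; the difference is that the loop and the coefficient lookups have been resolved before the function is ever called.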