The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
From the Editor | Main Blog Index
November 10, 2006
With just a weekend between us and the Supercomputing 2006 Conference (SC06) in Florida, most of my thoughts have already turned to Tampa. One thing I'm personally intrigued with this year is the choice of the conference keynote speaker, Ray Kurzweil. Although not a supercomputing groupie in the classical sense, Kurzweil has made a name for himself as an information technology visionary.
His latest book, "The Singularity is Near: When Humans Transcend Biology," is a compendium of much of his thinking over the last two decades. In the book, Kurzweil describes how, in the not too distant future, we will develop computer intelligence that will far exceed that of human intelligence. At that point, biological and non-biological intelligence will merge and the human race will reach what he calls "Singularity." Kurzweil says that at this point, technological change will proceed so rapidly that it will represent "a rupture in the fabric of human history."
A number of futurists have proposed a similar vision, but Kurzweil has put an interesting twist on it. Since he sees the rate of technological growth as a purely exponential progression rather than a linear progression, the attainment of Singularity will come within this century. He backs this up by citing a natural phenomenon called "the law of accelerating returns," in which an evolutionary process, such as technological innovation, creates a positive feedback loop to continuously accelerate the rate of change.
He's not claiming that specific technologies are on exponential tracks. For example, Moore's Law, which states that the number of transistors on a silicon chip will double every 18 months, will eventually run out of steam. Kurzweil predicts that Moore's Law will die a dignified death no later than 2019 as the limitations on semiconductor physics take hold. But just as vacuum tubes disappeared from computers in the 1960s, the broader trend of computing evolution will continue on beyond silicon chips. Kurzweil himself is betting on three-dimensional molecular computing after 2020.
Not everyone shares Kurzweil's take on the future. Professional technology kibitzers, such as Kevin Kelley and John Horgan, have written well-considered critiques of Kurzweil's transhumanistic views. In a recent (November 5th) CSPAN interview, a caller from Oak Ridge National Laboratory (ORNL) phoned in and labeled him a "crackpot." The ORNLian said Kurzweil's explanation of exponential technological growth was "bogus" and challenged him on some specific assertions. Kurzweil, -- obviously no stranger to these types of attacks -- calmly defended his views and proceeded to the next caller.
Kurzweil is no crackpot. He is a recognized authority in the fields of computer science and artificial intelligence. Among his inventions are the first computer-based reading machines for the blind. In 2002, Kurzweil was inducted into the U.S. Patent Office's National Inventors Hall of Fame. He has received numerous awards and accolades, including the Lemelson-MIT Prize, the National Medal of Technology and ACM's Grace Murray Hopper Award. While not collecting awards, Kurzweil is busy developing his nine businesses in OCR, music synthesis, speech recognition, reading technology, virtual reality, financial investment, cybernetic art, and other areas of artificial intelligence.
Barbara Horner-Miller, the SC06 chair, had this to say about Kurzweil: "The role of the keynote speaker is to get attendees thinking and interacting. So ideally it is someone who is interesting, stimulating and somewhat controversial. As soon as Ray Kurzweil's name came up, I knew we had our speaker ...."
-----
Locks Be Gone
Back to the present. Before we start building 3-D compute engines, we're going to need to figure out multi-threaded programming first. There was an interesting article in Technology Review last week called "The Trouble with Multi-Core Computers" that talks about some of the multi-threading programming challenges. The author, Kate Green, focuses on an approach called "transactional memory," which allows the programmer to use shared data in a multi-threaded environment without having to manage locks.
Writes Green: "It actually allows numerous transactions to share the same memory at the same time. When a transaction is complete, the system verifies that other transactions haven't made changes in the memory that would hinder the outcome of the first transaction. If they have, then the transaction is re-executed until it succeeds."
Transactional memory models, like the MIT one cited in this article, usually rely on some combination of software and hardware to work. A software-based model is called software transactional memory (STM), and until hardware assistance is developed, STM is the only practical implementation.
There are a multiple benefits to transactional memory. The obvious one is that the programmer is relieved of the burden of managing thread-safe critical regions to keep his data coherent. Not only does this simplify the coder's job, it also removes the threat of deadlocks, the bane of multi-threaded programming and the cause of many a sleepless night for the software engineer.
And for the performance obsessed, transactional memory can increase concurrency over lock-based approaches -- perhaps substantially. This is because the threads no longer have to wait for access to shared memory. In addition, different threads can be working on different parts of the same data structure that would normally be controlled by a single lock. At this point you might be thinking: Haven't we just shifted the overhead of synchronization to the memory system? Yes and no. The transactional memory approach relies on the fact that data contention between threads is a rare occurrence. Most of the time only a single thread is reading or writing a particular data item. So instead of paying the price of synchronization at every access, a transactional system only needs to track memory requests and sort things out when a collision occurs.
I say only, but in reality sorting out the memory accesses turns out to be the fundamental problem with transactional memory. Maintaining the order of memory accesses is difficult. Some of the models get a little loose with the memory ordering, and while that appeals to hardware designers, software developers expect deterministic memory access.
I'll close with a comment from the High-End Crusader, who offers his perspective on the Technology Review article:
"Does Kate Green's short piece on MIT's Krste Asanovic do a better job of articulating the problems that computing faces -- as we transition to homogeneous (and heterogeneous) polycore* processor dies -- than will next Friday's SC06 distinguished panel on multicore? Kate calls for the reinvention of parallel programming. She is clearly right. This is the $64,000 question in polycore.
"Programming is possible precisely when the programming abstractions are an order of magnitude less burdensome than the execution abstractions, which are managed by the runtime system. Designing good programming abstractions requires a good nose for which execution abstractions are most dangerous.
"If you think about it, transactions abstract from synchronization. Of course, microarchitects will obsess about whether the cores should transact against on-die shared memory or against off-die shared memory or both? That's their thing.
"But the real question stems from the hard fact that parallel computing is now, and has always been, a hard sell. Perhaps the wholesale replacement of synchronization by transactions will tempt Joe Programmer to hop on board the parallel-computing vessel. We need him. We need to offer him beer (or single-malt scotch) that goes down smooth. The future of computing depends on it."
* In this context, "multicore" means 2X, 4X, ... "polycore" means 128X, 256X, ...
-----
As always, comments about HPCwire are welcomed and encouraged. Write to me, Michael Feldman, at editor@hpcwire.com.
Posted by Michael Feldman - November 10 @ 12:00AM
(Digg, Technorati, more)
New Paper: Parallel Computing Without Parallel Programming
Learn how domain experts can run VHLL programs like MATLAB® on a variety of high-performance platforms without low-level reprogramming and how to work with the largest datasets and complex algorithms without sacrificing ease of use or reducing productivity.
Michael Feldman is the editor of HPCwire.
More Michael Feldman
still innovative by PhoenixW
Rediculous notion! by jimmymac
The benchmark is completely wrong. by Patrick LEE
SiCortex / Betamax by KevinButerbaugh
Good Luck to Silicon Graphics by Rick_Mandahl
It's About Realism not Speed by cyberdyne
SGI, Not Alone by EricS
Re: Obama Pushes Science Agenda by lwalker701
The battleground... by rgreen1
How it went wrong for SGi by atzanov
Harder than chess by addisonsnell
Debt consolidation by EliasV
Re: Recession Takes a Bite Out of Supercomputing by CooperO
How it went wrong for SGi by shawnu
How it all went wrong for SGI by jmh900
Torn between IRIX and Linux by Merblich
Sun Microsystems by IsaacU
New Search Engine Duck Duck Go by yegg
GlobalFoundries and IBM ? by gutiea
GlobalFoundries and IBM ? by gutiea
HPC Market by Flamingo
Fusion Cloud Rendering by gary@amd
Fusion Cloud Rendering by gary@amd
Not cores, but memory! by dmpase
Are you on Intel's payroll? by jimmymac
anchos by addisonsnell
anchos? by in_the_crease
Here's to Cray accuracy over HPCwire's. by taylors
Tech community prefers Pepsi to Coke by cogsci
Spider, the world's biggest Lustre-based, centerwide file system, has been fully tested to support Oak Ridge National Laboratory's new petascale Cray XT4/XT5 Jaguar supercomputer and is now offering early access to scientists.
Read More...
Wolfram Alpha, the Web-based computational engine introduced in May, is not a traditional supercomputing application, but relies on supercomputers to satisfy its unique requirements.
Read More...
There was a new energy at this year's TeraGrid '09 conference thanks to an outstanding turnout for the student program. Thanks to support from the National Science Foundation, more than 100 high school, undergraduate and graduate students were able to participate in the conference.
Read More...
Jul 09 | Engineer Live | The demand for computational tools to underpin the 3D seismic interpretation process has never been more apparent. Read more...
Jul 08 | EE Times | Unemployment for U.S. engineers has reached record levels, according to government figures. Read more...
Jul 08 | Network World | Global spending for 2009 projected to drop 6 percent, for a total of $3.2 trillion. Read more...
Jul 08 | Linux Magazine | Portability or efficiency? Neither is guaranteed when writing explicit parallel code. Read more...
Jul 07 | Ars Technica | Japanese company builds custom ASIC to accelerate real-time ray traced rendering for the auto industry. Read more...
Jul 10 | | Engineers, scientists, and other domain experts depend on the productivity enabled by very high-level language (VHLL) tools like MATLAB® and Python. However, as datasets grow larger and programs get more sophisticated, ordinary desktop computers can no longer keep up. The paper explores how to run VHLL programs on high-performance platforms without low-level reprogramming. Work with large datasets and complex algorithms without sacrificing ease of use or reducing productivity.
Apr 14 | | Many HPC IT departments are feeling the rising pressure to deliver more capacity computing and performance while trying to reduce the total cost of ownership. This white paper discusses how an environmentally-friendly and open-standards HPC building block based computing system using flexible interconnect options helps address capacity computing needs.
Source: Addison Snell, GM/VP, Tabor Research; sponsored by Dell
Many organizations that could benefit from the use of HPC clusters find that it is complicated to get the systems up and running because of limited IT resources or the complexities of the clusters themselves. Learn how the Intel Cluster Ready program, for which Dell was an original partner, seeks to address this challenge for entry level and mid-range HPC users.
BlueArc's Titan architecture represents an evolutionary step in file servers by creating a hardware-based file system that can scale bandwidth, IOPS, and overall data capacity well beyond conventional software-based devices. With its ability to virtualize a massive storage pool of up to four usable petabytes of tiered storage, Titan can scale with growing data requirements, offering a competitive advantage for businesses, researchers, or other enterprises seeking to better manage data growth while still ensuring optimal performance.
Sun Studio Compilers and Tools and Sun HPC ClusterTools allow you to create high performance parallel applications for OpenSolaris, Solaris and Linux. Sun Studio Express 11/08 includes MPI performance analysis capabilities and full OpenMP 3.0 compiler support. Learn about all this and the latest in Sun HPC ClusterTools 8.1.