HPC Matters is a joint blog consisting of contributors from the Tabor Communications team on their observations and insights into HPC matters.
October 22, 2010
Moore's Law is dead, or is it? There's the camp that believes Moore's Law, which states that transistor density on integrated circuits doubles about every two years, will be viable for only another decade or two. But there's another camp that thinks the technology already exists to extend the trend: multicore processors. National Instruments' P.J. Tanzillo is a proponent of the latter theory and has written an article on the subject at Technology Review.
"The general purpose computing market has made another quantum leap in processing power in the last five years, but this time it's not in clock rates, it's in the number of processing cores. Contrary to popular belief, Moore's Law is not dead. The number of transistors on modern processors continues to double every 18 months. Those transistors are now just manifesting themselves as additional processing cores. There are two primary reasons that this shift has been made: power and memory."
Tanzillo goes on to explain that with single-core processors, one way to increase performance is to increase clock rates, but with heating and energy concerns, that only goes so far. The increased density of multicore processors allows each core to be clocked well below its theoretical maximum, which assists with heat dissipation and power management.
As for the memory problem, Tanzillo relates how DRAM speed has been unable to keep pace with increases in microprocessor speed. Both are increasing exponentially, but microprocessor speed grows with the larger exponent, so the gap between the two keeps widening. This creates a situation where memory latency becomes the biggest bottleneck to system performance, also known as the memory wall problem. Although it would be nice to think multicore has solved this problem, it's really just postponed it a bit. The disparity still exists.
Machines with multiple applications that are each well suited to running on one core (as with a desktop computer) can take advantage of multicore architectures rather easily, with little reprogramming. But HPC presents a challenge because you have one application that must be divided up to run on multiple cores. Tanzillo explains:
"So, just like the supercomputing clusters of the past, algorithms written in FORTRAN and C need to be modified to take advantage of parallel processing cores. These applications need to be broken into threads and these threads need to be designed to avoid some of the common mistakes in parallelization of code like race conditions and priority inversion. In addition, memory and communication between processes must be made thread-safe, and shared resources need to be avoided or addressed. These issues continue to haunt developers updating legacy code to new architectures, and they often result in instability and/or disappointing performance gains. As a result, a set of complementary technologies are growing into maturity that allow programmers to take advantage of multicore systems in new and interesting ways."
Some of those "new and interesting ways" revolve around dataflow programming and virtualization, and cloud computing should be considered too, according to Tanzillo.
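To make the race condition Tanzillo mentions a little more concrete, here is a minimal sketch in Python. It is not taken from his article; the function name `run_workers` and the thread and iteration counts are purely illustrative assumptions. Several threads increment a shared counter, once with a lock guarding the update and once without.

```python
import threading


def run_workers(num_threads, increments, use_lock):
    """Increment a shared counter from several threads and return the final value.

    run_workers is a hypothetical helper written for this illustration.
    """
    counter = {"value": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            if use_lock:
                # The lock makes the read-modify-write step atomic
                # with respect to the other threads.
                with lock:
                    counter["value"] += 1
            else:
                # Unsynchronized read-modify-write: another thread may
                # update the counter between the read and the write,
                # and that update is then silently overwritten.
                current = counter["value"]
                counter["value"] = current + 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


safe_total = run_workers(4, 10_000, use_lock=True)
unsafe_total = run_workers(4, 10_000, use_lock=False)
print("with lock:   ", safe_total)
print("without lock:", unsafe_total)
```

With the lock, the total always equals the number of increments requested. Without it, the total may come up short on some runs and not on others, which is exactly the kind of intermittent instability that makes these bugs hard to track down in legacy code.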
One thing to keep in mind with multicore is that the math doesn't completely work out. Ideally, doubling the cores would double the performance, but that's not quite the case; the gain is closer to 50 percent. And then there's the 2009 Sandia study that suggested performance actually decreases for machines with more than eight cores:
"A Sandia team simulated key algorithms for deriving knowledge from large data sets. The simulations show a significant increase in speed going from two to four multicores, but an insignificant increase from four to eight multicores. Exceeding eight multicores causes a decrease in speed. Sixteen multicores perform barely as well as two, and after that, a steep decline is registered as more cores are added."
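One way to put rough numbers on the "doubling the cores gives about 50 percent" observation is Amdahl's Law, which neither Tanzillo nor the Sandia team is cited as using here; it is brought in only as an illustration. The sketch below assumes the 50 percent gain refers to going from one core to two, and the function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's Law: speedup over one core when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def parallel_fraction_from_speedup(speedup, cores):
    """Invert Amdahl's Law to find the parallel fraction implied by a measured speedup."""
    # From 1/S = (1 - p) + p/n it follows that p = (1 - 1/S) / (1 - 1/n).
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


# Assumption: a 50 percent gain (speedup of 1.5) when going from one core to two.
p = parallel_fraction_from_speedup(1.5, 2)
print("implied parallel fraction:", round(p, 3))

for n in (2, 4, 8, 16):
    print(n, "cores ->", round(amdahl_speedup(p, n), 2), "x")
```

Under that assumption about two thirds of the work runs in parallel, and the speedup can never exceed 3x no matter how many cores are added. Note that this idealized model only flattens out; it never predicts a slowdown, so it cannot by itself account for the decline the Sandia simulations report beyond eight cores.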
For an alternate perspective on the multicore debate, we can look to NVIDIA's Bill Dally, who believes that building parallel computers from the ground up using GPUs is the way to go. In his Forbes article from last April, Dally stated:
"To continue scaling computer performance, it is essential that we build parallel machines using cores optimized for energy efficiency, not serial performance. Building a parallel computer by connecting two to 12 conventional CPUs optimized for serial performance, an approach often called multi-core, will not work. This approach is analogous to trying to build an airplane by putting wings on a train. Conventional serial CPUs are simply too heavy (consume too much energy per instruction) to fly on parallel programs and to continue historic scaling of performance.
"The path toward parallel computing will not be easy. After 40 years of serial programming, there is enormous resistance to change, since it requires a break with longstanding practices. Converting the enormous volume of existing serial programs to run in parallel is a formidable task, and one that is made even more difficult by the scarcity of programmers trained in parallel programming."
A key point raised by both Tanzillo and Dally is that whether using multicore or parallel GPU-based machines, there's still the problem of parallelizing the software to take advantage of multiple processors. And it's not a minor problem. And yes, there's resistance to change. But at the end of the day, it's important to remember that while science isn't about technology, technology is a primary enabler of science.
Posted by Tiffany Trader - October 22, 2010 @ 5:33 PM, Pacific Daylight Time
Tiffany Trader is the editor of HPC in the Cloud. With a background in HPC publishing, she brings a wealth of knowledge and experience to bear on a range of topics relevant to the technical cloud computing space.