November 23, 2011

We Need More Than Multicore

Nicole Hemsoth

In a recent article in HPC Source magazine, HPC consultant Wolfgang Gentzsch discusses the good, the bad, and the ugly of multicore processors. The good: their great performance potential, and recent software development environments that provide excellent support for multicore parallelization. The bad: you wouldn’t really rewrite all the billions of lines of code out there, would you? And even if you wanted to, how many algorithms bullheadedly resist parallelization because they are simply serial? The ugly: all that effort is for nothing when even the best core-parallel codes run in a multi-user, multi-job environment. Hybrid systems will further complicate the challenge of optimizing system utilization. And it’s all getting worse:

Since the first multicore announcements seven years ago, we have witnessed the release of 2-core, 4-core, 6-core, 8-core, 12-core and, with the latest AMD Interlagos and Fujitsu SPARC64 IXfx, 16-core processors. In 2012, organizations will be deploying large numbers of relatively low-cost 32-, 64-, even 128-core servers, and one can infer from processor roadmaps that core counts will continue rising at a rapid pace. Yes, Moore’s Law lives on.

Remember Amdahl’s Law of Serialization? It is one of the natural boundaries we faced when we tackled vector machines, then parallel machines, and now, once again, multicore machines. Still, with vectors and parallel processes life was good: our jobs used a dedicated system (and, by the way, wasted a lot of system resources) and performance was mostly predictable. Now, with fine-grained thread-parallel codes, we could fully and simultaneously exploit all system functions in real time; we could… But given the need to run many concurrent tasks, each competing for shared system resources, optimizing multicore system performance becomes a non-trivial exercise.
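To make the Amdahl boundary concrete, here is a minimal sketch of the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the runtime that can be parallelized and n is the number of cores. The function name and the 95 percent parallel fraction are assumptions chosen for illustration; they do not come from Gentzsch's article.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the
    runtime benefits from additional cores (Amdahl's Law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at full cost; only the parallel part is divided.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the runtime is parallelizable.
    for cores in (2, 4, 8, 16, 32, 64, 128):
        print(f"{cores:>3} cores: {amdahl_speedup(0.95, cores):.2f}x")
```

Under that assumed 95 percent figure, the bound can never exceed 20x, and even 128 cores yield only about 17.4x. Note that this is the best case on a dedicated machine; it does not model the contention between concurrent jobs that the article is concerned with.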

The article further discusses multicore challenges in the context of workload managers, micro-level scheduling, time-slice based operating systems, resource contention, kernel-level parallelization, resource allocation via dynamic intelligence, and the MCOPt multicore manager, which inserts an intelligent ‘traffic manager’ into the kernel. 

Enabling multicore technology to deliver on its potential will enhance application performance and aid server consolidation and energy efficiency efforts. Getting there will require that applications be parallelized to the extent possible and that the OS be augmented with intelligence that allows many concurrently running tasks to share system resources gracefully. Let’s not forget about Amdahl’s Law!

Wolfgang’s full article just appeared in the SC11 Supplement of the HPC Source magazine at
