The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
April 27, 2009
Parallel Algorithms by Henri Casanova, Arnaud Legrand, and Yves Robert (CRC Press, 2009) is a text meant for those with a desire to understand the theoretical underpinnings of parallelism from a computer science perspective. As the authors themselves point out, this is not a high performance computing book -- there is no real attention given to HPC architectures or practical scientific computing. You also won't leave this book a competent parallel programmer ready to implement an application. But you will have the tools you need to continue on a rigorous research track into the computer science aspects of parallel computing.
The preface describes the text as aimed at graduate students and postgraduate researchers in computer science, and this is dead on. The book is very general, and very theoretical, with proofs, theorems, lemmas, complexity analysis, and the whole nine yards. In the quest to maintain generality and build a theoretical framework for understanding research aspects of parallel algorithms, you don't get much further than matrix-matrix multiplication and basic stencil computation in terms of practical discussion of algorithms. Each chapter includes a thorough problem set that extends the topics covered in the chapter; solutions are provided for select problems.
The book is organized into three sections: models, parallel algorithms, and scheduling. The models section opens in chapters 1 and 2 with two classic theoretical models of parallel computing, PRAM and sorting networks. Chapter 3 covers the models of communication networks needed to reason about the complexity and general effectiveness of algorithms when implemented on specific hardware. This chapter talks about topologies like cliques, rings, grids, and variants of the torus and hypercube, and also touches on models for peer-to-peer computing networks.
Chapters 4 and 5 discuss parallel algorithms on rings and grids of processors. The algorithmic discussion serves as a foundation for developing the theoretical tools needed to reason about the performance and complexity of parallel algorithms in general; these chapters are not meant to be implementation guides for application developers. The authors examine matrix-vector and matrix-matrix multiplication as well as basic stencil computations and LU factorization. Basic data distribution patterns (block, cyclic, etc.) are also discussed and analyzed alongside the algorithms, in the context of the communication network models, to show the theoretical performance advantages of each and why data distribution matters for effective parallel computation. These sections of chapters 4 and 5 establish theoretical foundations for some of the rules of thumb we have in HPC, explaining why they work, and the authors use them to draw some general conclusions about the virtues and vices of the topologies relative to one another.
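For readers who have not met the block and cyclic distributions mentioned above, here is a minimal sketch of how each one maps array indices to processors. It is an illustration only, not code from the book; the function names `block_owner` and `cyclic_owner` and the sizes `n = 8` and `p = 4` are assumptions chosen for the example.

```python
def block_owner(i, n, p):
    """Owner of index i when n items are split into p contiguous blocks."""
    block_size = -(-n // p)  # ceiling of n / p
    return i // block_size


def cyclic_owner(i, p):
    """Owner of index i when items are dealt out round-robin to p processors."""
    return i % p


n, p = 8, 4  # hypothetical problem size and processor count
block_layout = [block_owner(i, n, p) for i in range(n)]
cyclic_layout = [cyclic_owner(i, p) for i in range(n)]

print("block: ", block_layout)   # [0, 0, 1, 1, 2, 2, 3, 3]
print("cyclic:", cyclic_layout)  # [0, 1, 2, 3, 0, 1, 2, 3]
```

The difference matters for something like LU factorization, where the active part of the matrix shrinks as the computation proceeds: with a block layout the processors holding the early indices run out of work first, while a cyclic layout keeps every processor involved for longer.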
The remaining chapters are about workload management. Chapter 6 addresses load balancing within an application running on a heterogeneous platform, e.g., a cluster with some fast and some slow(er) processors. The chapter builds its core discussion on one-dimensional data distributions, for which there are accessible solutions, and examines those in the context of stencil computations and LU factorization. Then the authors address the difficulties of balancing load in two-dimensional data distributions. Chapters 7 and 8 address task graph scheduling algorithms. Chapter 7 addresses the fundamentals and provides the definitions and theorems needed to prove characteristics of task graph scheduling approaches. Chapter 8 advances this discussion and addresses scheduling of divisible load applications, throughput optimization for master-worker applications in steady-state, scheduling of independent tasks, and loop nest scheduling.
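To give a flavor of the one-dimensional load-balancing problem, the sketch below hands out identical work items to processors of different speeds so that the slowest finisher is done as early as possible. It uses a simple greedy rule (give the next item to whichever processor would finish it soonest) and is not presented as the book's own algorithm; the function name `allocate` and the per-item cycle times are assumptions made for the example.

```python
def allocate(n_items, cycle_times):
    """Greedily assign n_items identical items to processors.

    cycle_times[k] is the (hypothetical) time processor k needs per item.
    Each item goes to the processor whose finish time would be smallest
    after receiving it; ties go to the lowest-numbered processor.
    """
    counts = [0] * len(cycle_times)
    for _ in range(n_items):
        best = min(
            range(len(cycle_times)),
            key=lambda k: (counts[k] + 1) * cycle_times[k],
        )
        counts[best] += 1
    return counts


cycle_times = [1, 2, 4]  # one fast, one medium, one slow processor
counts = allocate(7, cycle_times)
finish_time = max(c * t for c, t in zip(counts, cycle_times))

print(counts)       # [4, 2, 1]
print(finish_time)  # 4
```

In this example every processor finishes at time 4, whereas an even split of the seven items would leave the fast processor idle while the slow one is still working.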
Parallel Algorithms is a book you study, not a book you read. Those well past their CS finals or long out of the research aspects of computer science may find portions of the discussion inaccessible. But those motivated to work through the text will be rewarded with a solid foundation for the study of parallel algorithms.
Parallel Algorithms (Chapman & Hall/CRC Numerical Analy & Scient Comp. Series)