The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
December 09, 2008
The second wave of GPGPU software development tools is upon us. The first wave, exemplified by NVIDIA's CUDA and AMD's Brook+, allowed early adopters to get started with GPU computing via low-level, vendor-specific tools. Next-generation tools from The Portland Group Inc. (PGI) and France-based CAPS Enterprise enable everyday C and Fortran programmers to tap into GPU acceleration within an integrated heterogeneous computing environment.
Over the past five years, the HPC community coalesced around the x86 architecture. That made the choice of targets easy for companies like PGI. Today 85 percent of the TOP500 is based on 64-bit x86 microprocessors, and the percentage is probably even higher in the sub-500 realm. While Intel and AMD are continuing to innovate with multicore architectures, they are constrained to clock frequencies of around 2.5-3.5 GHz.
Meanwhile, GPUs have become general-purpose vector processors with hundreds of simple cores and are scaling at a faster rate than CPUs. The fact that compiler vendors like PGI are now targeting GPUs says a lot about where the industry is headed with general-purpose acceleration, especially in the HPC space.
"As a compiler vendor, we asked ourselves: 'What comes next?'" said Doug Miles, director of Advanced Compilers and Tools at PGI. "Our best guess is that accelerated computing is what comes next."
PGI is betting that 64-bit x86 with "some type of accelerator" will be the new platform of choice for many HPC applications. Right now, the GPU is the accelerator du jour of supercomputing. The first accelerator target for PGI is CUDA-enabled NVIDIA GPUs. To implement it, PGI will leverage the CUDA toolchain and associated SDK, while the host-side compilation will rely on PGI's x86 technology.
Since GPUs are attached to the host platform as external devices rather than as true coprocessors, the low-level software model is quite complex. From the host side, it involves data transfers between the CPU and the GPU (over PCIe), memory allocation/deallocation, and other low-level device management. On the GPU side, the code can also be fairly involved, since it has to deal with algorithm parallelization and the GPU's own memory hierarchy.
To make GPU programming more productive, it's worthwhile to hide most of these details from the application developer. What PGI has done is define a set of C pragmas and Fortran directives that can be embedded in the source code and direct the compiler to offload the specified code sequences to the GPU.
This approach is analogous to OpenMP, which defines pragmas and directives to apply multithreading on top of a sequential program. Unlike a library-based approach, this model enables developers to maintain a common source base for a variety of different targets. In the PGI case, compilers that are not accelerator-aware can use the same source, but will just ignore the foreign pragmas or directives. Even within the PGI environment, the accelerator pragmas and directives can be switched off at compile time so that only x86 code is generated.
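The OpenMP analogy can be made concrete with a small C sketch. The function name saxpy and its parameters are hypothetical and chosen only for illustration. The same source builds with or without OpenMP support: a compiler that does not recognize the pragma skips it and runs the loop sequentially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration: y[i] = alpha * x[i] + y[i]. */
void saxpy(int n, float alpha, const float *x, float *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    /* With OpenMP enabled (for example -fopenmp in GCC), the loop
       iterations are divided among threads. Without it, the pragma is
       ignored and the loop runs as ordinary sequential C. */
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        y[i] = alpha * x[i] + y[i];
}
```

Either way the function computes the same result, which is the property that lets one source base serve several targets.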
The general form of the C accelerator pragma is #pragma acc directive-name [clause [,clause]…]; the equivalent for Fortran is !$acc directive-name [clause [,clause]…]. Applying an accelerator region to a matrix multiplication loop in Fortran would look like this:
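module mymm
contains
  subroutine mm1( a, b, c, m )
    real, dimension(:,:) :: a,b,c
    integer i,j,k,m,n,p
    ! n and p are used as loop bounds but were not declared or set in
    ! the listing as printed; taking them from the array shapes is an
    ! assumption.
    n = size(a,1)
    p = size(b,2)
    !$acc region
    do j = 1,m
      do i = 1,n
        a(i,j) = 0.0
      enddo
      do k = 1,p
        do i = 1,n
          a(i,j) = a(i,j) + b(i,k) * c(k,j)
        enddo
      enddo
    enddo
    !$acc end region
  end subroutine
end module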
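For comparison, a rough C counterpart of the same region might look like the sketch below. It is not taken from PGI's documentation: the row-major layout, the explicit dimension arguments and the use of restrict are assumptions made for illustration, and so is the braced block following the pragma, since the article gives only the general pragma form. A compiler that does not recognize the pragma ignores it and produces plain host code.

```c
#include <assert.h>

/* Hypothetical C version of mm1: a (n x m) = b (n x p) * c (p x m),
   with all matrices stored row-major in flat arrays. */
void mm1(int n, int m, int p,
         float *restrict a, const float *restrict b, const float *restrict c)
{
    assert(n >= 0 && m >= 0 && p >= 0);

    /* Assumed syntax: the region applies to the block that follows.
       Compilers without accelerator support skip the pragma. */
    #pragma acc region
    {
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < m; ++j) {
                float sum = 0.0f;
                for (int k = 0; k < p; ++k)
                    sum += b[i * p + k] * c[k * m + j];
                a[i * m + j] = sum;
            }
        }
    }
}
```

The loop nest contains no device management code; in this model the data movement and kernel generation are left to the compiler.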