November 02, 2009
Writing and implementing high performance computing applications is all about efficiency, parallelism, scalability, cache optimizations and making best use of whatever resources are available -- be they multicore processors or application accelerators, such as FPGAs or GPUs. HPC applications have been developed for, and successfully run on, grids for many years now.
HPC on Grid
A good example of a number of different components of HPC applications can be seen in the processing of data from CERN's Large Hadron Collider (LHC). The LHC is a gigantic scientific instrument (with a circumference of over 26 kilometres), buried underground near Geneva, where beams of subatomic particles -- called hadrons, either protons or lead ions -- are accelerated in opposite directions and smashed into each other at 0.999997828 times the speed of light. Its goal is to develop an understanding of what happened in the first 10^-12 of a second at the start of the universe after the Big Bang, which will in turn confirm the existence of the Higgs boson, help to explain dark matter, dark energy, anti-matter, and perhaps the fundamental nature of matter itself.
Data is collected by a number of "experiments," each of which is a large and very delicate collection of sensors able to capture the side effects caused by exotic, short-lived particles that result from the particle collisions. When accelerated to full speed, the bunches of particles pass each other 40 million times a second. Each bunch contains 10^11 particles, and the result is one billion collision events being detected every second. This data is first filtered by a system built from custom ASIC and FPGA devices. It is then processed by a 1,000-processor compute farm, and the filtering is completed by a 3,400-processor farm. After the data has been reduced by a factor of 180,000, it still amounts to 3,200 terabytes a year. And the HPC processing undertaken to reduce the data volume has hardly scratched the surface of what happens next.
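To put the figures above in proportion, here is a rough back-of-the-envelope calculation in Python. It uses only the numbers quoted in the text; the variable names are illustrative, and it assumes decimal terabytes, a 365-day year of continuous averaging, and that the factor of 180,000 applies directly to data volume.

```python
# Figures quoted in the article.
crossings_per_second = 40_000_000        # bunch crossings per second
collisions_per_second = 1_000_000_000    # collision events detected per second
reduction_factor = 180_000               # overall reduction after filtering
stored_tb_per_year = 3_200               # terabytes kept per year

# Average number of collision events per bunch crossing.
collisions_per_crossing = collisions_per_second / crossings_per_second

# Assumption: decimal units (1 TB = 1,000,000 MB) and a 365-day year.
seconds_per_year = 365 * 24 * 3600
stored_mb_per_second = stored_tb_per_year * 1_000_000 / seconds_per_year

# Assumption: the reduction factor applies to data volume, so the
# unfiltered volume is the stored volume multiplied by that factor.
unfiltered_tb_per_year = stored_tb_per_year * reduction_factor

print(f"collisions per crossing: {collisions_per_crossing:.0f}")
print(f"stored data rate, averaged over a year: {stored_mb_per_second:.1f} MB/s")
print(f"implied unfiltered volume: {unfiltered_tb_per_year:,} TB/year")
```

Under those assumptions, each bunch crossing yields about 25 collision events, the stored data averages roughly 100 MB every second of the year, and the unfiltered volume would be several hundred million terabytes a year, which is why the filtering has to happen in hardware and dedicated farms before anything reaches the grid.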
Ten major compute sites around the world comprising many tens of thousands of processors (and many smaller facilities) are then put to work to interpret what happened during each "event." The processing is handled, and the data distribution managed, by the LHC Grid, which is based on grid middleware called gLite that was developed by the major European project, Enabling Grids for E-sciencE (EGEE). High performance is achieved at every stage because the programs have been developed with a detailed knowledge and understanding of the grid, cluster or FPGA that they target.
From Grid to Cloud
Grid computing isn't dead, but long live cloud computing. As far as early-adopter end users in our 451 ICE program are concerned, cloud computing is now seen very much as the logical endpoint for combined grid, utility, virtualization and automation strategies. Indeed, enterprise grid users see grid, utility and cloud computing as a continuum: cloud computing is grid computing done right; clouds are a flexible pool, whereas grids have a fixed resource pool; clouds provision services, whereas grids provision servers; clouds are business, and grids are science. And so the comparisons go on, but through cloud computing, grids now appear to be at the point of meeting some of their promise.
One obvious way to regard cloud computing is as the new marketing-friendly name for utility computing, sprinkled with a little Internet pixie dust. In many respects, its aspirations match the original aspirations of utility computing -- the ability to turn on computing power like a tap and pay on a per-drink basis. "Utility" is a useful metaphor, but it's ambiguous because IT is simply not as fungible as electrical power, for example. The term never really took off. Grid computing, in the meantime, has been hung up on the pursuit of interoperability and the complexity of standardization. Taking the science out of grids has proved to be fairly intractable for all but high performance computing and specialist application tasks.
Clouds usefully abstract away the complexity of grids and the ambiguity of utility computing, and they have been adopted rapidly and widely. Since then everyone has been desperately trying to work out what cloud computing means and how it differs from utility computing. It doesn't, really. Cloud computing is utility computing 2.0 with some refinements, principally, that it is delivered in ways we think are very likely to catch on.
But as cloud abstracts away the complexity, it also abstracts away visibility of the detail of the underlying execution platform. And without a deep understanding of how to optimize for the target platform, high performance computing becomes, well, just computing.
Human-readable programs are translated into ones that can be executed on a computer by a program called a compiler. A compiler's first step is that of lexical analysis, which converts a program into its logical components (i.e., language keywords, operators, numbers and variables). Next, the syntax analysis phase checks that the program complies with the grammar rules of the language. The final two phases of optimization and code generation are often tightly linked so as to be one and the same thing (although some generic optimizations such as common sub-expression elimination are independent of code generation). The more the compiler knows about the target system, the more sophisticated the optimizations it can perform, and the higher the performance of the resulting program.
But if a program is running in the cloud, the compiler doesn't know any detail of the target architecture, and so must make lowest-common-denominator assumptions such as an x86 system with up to 8 cores. Much higher performance, however, may be achieved by compiling for many more cores, or an MPI-based cluster, or GPU or FPGA.
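As an illustration of the lexical analysis step described above, here is a minimal sketch of a tokenizer in Python. The token categories, the tiny keyword list and the function name `tokenize` are assumptions chosen for the example; a production compiler's lexer handles far more (strings, comments, multi-character operators).

```python
import re

# Hypothetical token categories for a toy language. KEYWORD is listed
# before IDENT so that reserved words are not classified as variables.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("KEYWORD", r"\b(?:if|else|while|return)\b"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=<>()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Split source text into (category, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("return x + 42"))
```

The output of this phase, a flat list of classified tokens, is what the syntax analysis phase would then check against the grammar.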
Such technology has become a hot commodity. Google bought PeakStream, Microsoft bought the assets of Interactive Supercomputing and Intel bought RapidMind and Cilk Arts. So the major IT companies are buying up this parallel processing expertise.
Multicore causes mainstream IT a problem in that most applications will struggle to scale as fast as new multicore systems do, and most programmers are not parallel processing specialists. And this problem is magnified many times over when running HPC applications in the cloud, since even if the programmer and the compilers being used could do a perfect job of optimizing and parallelizing an application, the detailed target architecture is unknown.
Is there a solution? In the long term new programming paradigms or languages are required, perhaps with a two-stage compilation process that compiles to an intermediate language but postpones the final optimization and code generation until the target system is known. And no, I don't think Java is the answer.
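The idea of postponing target-specific decisions can be pictured with a loose analogy in Python. This is not a two-stage compiler; it is only a sketch in which the work is described in a portable form and the degree of parallelism is chosen once the core count of the actual machine is known. The names `plan_for_target` and `run` are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def plan_for_target(core_count, work_items):
    """Late stage: pick worker count and chunk size for a known target."""
    workers = max(1, min(core_count, work_items))
    chunk = -(-work_items // workers) if work_items else 0  # ceiling division
    return workers, chunk


def run(data, kernel):
    """Apply kernel to every item, deciding the split only at run time."""
    if not data:
        return []
    cores = os.cpu_count() or 1  # discovered on the machine actually used
    workers, chunk = plan_for_target(cores, len(data))
    chunks = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so the output order is stable.
        parts = pool.map(lambda part: [kernel(x) for x in part], chunks)
        return [y for part in parts for y in part]


print(plan_for_target(8, 100))
print(run(list(range(10)), lambda x: x * x))
```

Threads are used here only to keep the sketch short; in CPython they do not speed up CPU-bound work. The point is the ordering of decisions: nothing about the core count is fixed until the program lands on its target.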