February 21, 2011
Exascale computing promises incredible science breakthroughs, but it won't come easily, and it won't come free. That's the premise of a feature story from the DOE's Office of Advanced Scientific Computing Research, whose mission is "to discover, develop, and deploy the computational and networking tools that enable researchers in the scientific disciplines to analyze, model, simulate, and predict complex phenomena important to the Department of Energy."
The article makes the case for exascale computing, citing scientific breakthroughs that such a leap would enable, including precise long-range weather forecasting, innovative alternative fuels and advances in disease research. The ability to represent many more variables will lead to more realistic models. For example, future researchers will be able to create a global climate model with a level of resolution that is now only possible for regional studies.
There are three main obstacles standing in the way of tomorrow's exascale behemoths, and according to Rick Stevens, Argonne National Laboratory associate director for computing, environmental and life science and a University of Chicago computer science professor, all are potential showstoppers.
The current exascale model predicts a machine with a billion cores. So the first challenge is creating software that can take advantage of all of them. This is parallelism in the extreme. Applications have been developed that can achieve 250,000-way parallelism, but exaflop-class machines will be called upon to exhibit 1-billion-way parallelism.
Another daunting concern is power. Stevens says that a 1-billion-processor computer made with today's technology would consume more than a gigawatt of electricity. According to the DOE's Energy Information Administration, even the largest US power plants generate only a few gigawatts, with most producing less than four. That means that a single exascale machine could necessitate its own power plant. GPU computing is being looked at as a potential way to curb energy demands.
The enormous increase in the number of processing cores is what leads to the third major challenge, reliability. Whatever reliability issues exist in a modern system will be magnified a thousandfold, such that, according to Stevens, "If you just scale up from today's technology, an exascale computer wouldn't stay up for more than a few minutes at a time." Practically speaking, a machine's mean time between failures must be about a week or more. For comparison, Lawrence Livermore National Laboratory's IBM BlueGene/L fails about once every two weeks.
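The reliability concern can be checked with back-of-the-envelope arithmetic. The sketch below is illustrative only and is not taken from the DOE article: it assumes failures are independent, so that the time between failures shrinks in proportion to the scale factor, and it uses the two-week BlueGene/L figure as a baseline. The function name `scaled_mtbf_minutes` is hypothetical.

```python
# Rough sketch: how the time between failures shrinks as a system scales up.
# Assumption (not from the article): failures are independent, so the
# system-wide failure rate grows in proportion to the number of components.

MINUTES_PER_DAY = 24 * 60


def scaled_mtbf_minutes(baseline_mtbf_days: float, scale_factor: float) -> float:
    """Return the time between failures, in minutes, after scaling up by scale_factor."""
    if scale_factor <= 0:
        raise ValueError("scale_factor must be positive")
    return baseline_mtbf_days * MINUTES_PER_DAY / scale_factor


baseline_days = 14   # BlueGene/L: about one failure every two weeks (from the article)
scale = 1000         # the thousandfold magnification cited in the article

print(f"Time between failures: {scaled_mtbf_minutes(baseline_days, scale):.1f} minutes")
```

Under these assumptions a two-week failure interval drops to roughly 20 minutes, the same order of magnitude as Stevens's estimate and far short of the week or more that practical operation requires.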
At the heart of all these challenges is funding, specifically government funding. This is the fundamental factor on which the success or failure of exaflop-level computing hinges. As explained in the article, scientific computing is a niche market, not sustained by the overall IT industry, which is driven by consumer electronics innovations. Therefore, "complex and coordinated R&D efforts [are required] to bring down the cost of memory, networking, disks and all of the other essential components of an exascale system."
Full story at DOE's Office of Advanced Scientific Computing Research