November 04, 2005
Members of Berkeley Lab's Computing Sciences divisions are applying their expertise in running scientific codes and evaluating high-performance computers to achieve "real world" assessments of leading supercomputers around the world. Their goal is to determine which architectures are best suited for advancing computational science.
With the re-emergence of viable vector computing systems such as the Earth Simulator and the Cray X1, and with IBM and DOE's Blue Gene/L taking the top spot as the world's fastest computer, there is renewed debate about which architecture is best suited for running large-scale scientific applications.
To cut through conflicting claims, researchers from Berkeley Lab's Computational Research and NERSC Center divisions have been putting various architectures through their paces, running benchmarks as well as scientific applications key to Department of Energy programs. The team includes Lenny Oliker, Julian Borrill, Andrew Canning and John Shalf of CRD; Jonathan Carter and David Skinner of NERSC; and Stephane Ethier of the Princeton Plasma Physics Laboratory. Their evaluations have resulted in a half-dozen papers published in journals and presented at conferences in the United States, Norway, Japan and Spain.
In the initial part of their study, the team traveled to Japan in December 2004 and ran four scientific applications key to DOE research programs on five different systems. As part of the effort, the group became the first international team to conduct a performance evaluation study of the 5,120-processor Earth Simulator.
The team also assessed the performance of
"This effort relates to the fact that the gap between peak and actual performance for scientific codes keeps growing," said team leader Lenny Oliker. "Because of the increasing cost and complexity of HPC systems" -- high-performance computing systems -- "it is critical to determine which classes of applications are best suited for a given architecture."
The four applications and research areas selected by the team for the evaluation were
"The four applications successfully ran on the Earth Simulator with high parallel efficiency," Oliker said. "And they ran faster than on any other measured architecture -- generally by a large margin." However, Oliker added, only codes that scale well and are suited to the vector architecture may be run on the Earth Simulator. "Vector architectures are extremely powerful for the set of applications that map well to those architectures," Oliker said. "But if even a small part of the code is not vectorized, overall performance degrades rapidly."
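Oliker's point about partially vectorized code is commonly explained with Amdahl's law: the part of a program that is not accelerated limits the overall speedup. The sketch below is only an illustration of that general relationship, not the team's methodology. The function name and the 16x vector speedup are assumptions chosen for the example, not figures from the study.

```python
def amdahl_speedup(vector_fraction: float, vector_speedup: float) -> float:
    """Overall speedup when only `vector_fraction` of the scalar runtime
    is accelerated by a factor of `vector_speedup` (Amdahl's law)."""
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be between 0 and 1")
    if vector_speedup <= 0.0:
        raise ValueError("vector_speedup must be positive")
    scalar_part = 1.0 - vector_fraction            # runs at the original speed
    vector_part = vector_fraction / vector_speedup  # runs faster on the vector unit
    return 1.0 / (scalar_part + vector_part)


if __name__ == "__main__":
    # Hypothetical 16x vector speedup; the fractions are illustrative only.
    for fraction in (1.0, 0.99, 0.95, 0.90):
        speedup = amdahl_speedup(fraction, 16.0)
        print(f"{fraction:.0%} vectorized -> {speedup:.1f}x overall")
```

Under these assumed numbers, leaving 10 percent of the runtime unvectorized cuts the overall gain from 16x to 6.4x, which is the kind of rapid degradation the quote describes.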
One of the codes, LBMHD, ran at 67 percent of peak system performance, even when scaled up to 4,800 processors. However, as with most scientific inquiries, the ultimate solution to the problem is neither simple nor straightforward.
"We're at a point where no single architecture is well suited to the full spectrum of scientific applications," Oliker said. "One size does not fit all, so we need a range of systems. It's conceivable that future supercomputers would have heterogeneous architectures within a single system, with different sections of a code running on different components."
One of the codes the group intended to run in this study -- MADCAP, the Microwave Anisotropy Dataset Computational Analysis Package -- did not scale well enough to be used on the Earth Simulator. MADCAP, developed by Julian Borrill, is a parallel implementation of cosmic microwave background map-making and power spectrum estimation algorithms. Since MADCAP has high input-output requirements, its performance was hampered by the lack of a fast global file system on the Earth Simulator.
Undeterred, the team retuned MADCAP and returned to Japan to try again. The results, outlined in a paper titled "Performance characteristics of a cosmology package on leading HPC architectures" and presented at the 11th International Conference on HPC in Bangalore, India, found that the Cray X1 had the best runtimes for MADCAP but suffered the lowest parallel efficiency. The Earth Simulator and IBM Power3 demonstrated the best scalability, and the code achieved the highest percentage of peak on the Power3. The paper concluded, "Our results highlight the complex interplay between the problem size, architectural paradigm, interconnect, and vendor-supplied numerical libraries, while isolating the I/O filesystem as the key bottleneck across all the platforms."
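A system can post the best runtimes and still show the lowest parallel efficiency, because efficiency measures how much of the ideal speedup is retained as processors are added, not how fast the run is in absolute terms. The sketch below assumes a fixed problem size (strong scaling) and uses made-up timings for two hypothetical machines. None of the numbers come from the MADCAP paper, and the function name is an assumption for illustration.

```python
def parallel_efficiency(base_procs: int, base_time: float,
                        procs: int, time: float) -> float:
    """Fraction of ideal speedup achieved when scaling from `base_procs`
    to `procs` processors on a fixed-size problem."""
    if min(base_procs, procs) <= 0 or min(base_time, time) <= 0.0:
        raise ValueError("processor counts and times must be positive")
    measured_speedup = base_time / time
    ideal_speedup = procs / base_procs
    return measured_speedup / ideal_speedup


if __name__ == "__main__":
    # Hypothetical runtimes in seconds at 64 and 256 processors.
    machine_a = parallel_efficiency(64, 10.0, 256, 4.0)   # faster in absolute terms
    machine_b = parallel_efficiency(64, 40.0, 256, 11.0)  # slower, but scales better
    print(f"machine A efficiency: {machine_a:.3f}")
    print(f"machine B efficiency: {machine_b:.3f}")
```

In this invented case, machine A finishes sooner at both processor counts yet retains less of the ideal 4x speedup than machine B.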
Blue Gene/L is currently the world's fastest supercomputer, with the first Blue Gene system being installed at Lawrence Livermore National Laboratory. David Skinner is serving as Berkeley Lab's representative to a new BlueGene/L Consortium led by Argonne National Laboratory. The consortium aims to bring together institutions active in HPC research and build a community focused on the Blue Gene family as a next step toward petascale computing. Its members will work together to develop or port Blue Gene applications and system software, conduct detailed performance analysis on applications, develop mutual training and support mechanisms, and contribute to future platform directions.
This is a reprint of an article originally published by Berkeley Lab Computing Sciences