The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
November 19, 2008
Researchers at Tohoku University in Sendai, north-eastern Japan, announced on Wednesday that they had broken a batch of performance records on their NEC SX-9 supercomputer, as measured on the HPC Challenge Benchmark test. Hiroaki Kobayashi, director of the university's Cyberscience Center, said the SX-9 had achieved the highest marks ever in 19 of 28 areas the test evaluates in computer processing, memory bandwidth and networking bandwidth. The scores were matched against those previously achieved on the same independent benchmark test by other leading supercomputers, including IBM's Blue Gene/L, Cray's XT3/4 and SGI's Altix ICE, with the SX-9 coming out on top 64 percent of the time.
The news comes at a good time for NEC. The Tokyo-based manufacturer of vector-based supercomputers is battling in a market that has been moving away from its expensive high-performance vector processing models to systems that use more modestly priced commodity-type superscalar CPUs. These cheaper chips can be coupled tightly together or used in clusters of computers to achieve similar or better results than vector competitors -- at least in some areas of supercomputing.
At Tohoku University, however, a stronghold of vector computing since it installed its first SX-1 in 1985, Director Kobayashi argues that vector computing is essential for certain types of applications and will only increase in importance as advances are made in parallel processing.
"In the future, data parallel processing will become more important in high performance computing," says Kobayashi. "And vector processing provides a very efficient model for it." This is why, he adds, Intel, which has long provided short vector SIMD code extensions for its x86 architecture, is employing wider vector operations in its upcoming Larrabee graphics processing chip. "Regarding parallel processing, at the instruction-set level, vector instruction sets are the key to future processors, no matter what kind of micro-architecture is used," says Kobayashi.
In addition, he emphasizes that for the kind of programs that the 1,500 paying supercomputer users of the University's Cyberscience Center want to run, vector is still king. Most of these users are involved in government and academic research programs in areas like aerospace, environmental simulations, structural analysis and nanotechnology. "They want to conduct very large simulations, so are looking for an efficient handling mechanism to process extremely large amounts of data in a single operation," says Kobayashi. "Vector processing is best suited to this kind of application."
The SX-9 employs a single-chip vector processor capable of reaching 102 GFLOPS. Up to 16 CPUs sharing 1 TB of memory can be incorporated on a single node, combining to produce 1.6 TFLOPS of peak performance. The Tohoku University SX-9 set-up, which began operations this April, consists of 16 nodes, each of 16 CPUs, producing an overall peak performance of 26 TFLOPS. On a sustained performance basis, the Cyberscience Center's test results show that a single SX-9 CPU outperforms a CPU of the previous SX-8R by a factor of four to eight, depending on the application.
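The peak figures quoted above follow from simple multiplication. As a minimal sketch in Python, using only the numbers given in the article (the constant and function names are illustrative, not NEC terminology):

```python
# Peak-performance arithmetic from the figures quoted in the article.
# All names here are illustrative; only the three numbers come from the text.
CPU_PEAK_GFLOPS = 102   # peak of one SX-9 CPU
CPUS_PER_NODE = 16      # maximum CPUs on a single node
NODES = 16              # nodes in the Tohoku University installation


def peak_gflops(cpus: int, per_cpu_gflops: int = CPU_PEAK_GFLOPS) -> int:
    """Theoretical peak is the CPU count times the per-CPU peak."""
    return cpus * per_cpu_gflops


node_peak = peak_gflops(CPUS_PER_NODE)             # one fully populated node
system_peak = peak_gflops(CPUS_PER_NODE * NODES)   # the whole installation

print(f"node peak:   {node_peak / 1000:.1f} TFLOPS")
print(f"system peak: {system_peak / 1000:.1f} TFLOPS")
```

These are theoretical peaks. The sustained, application-level results the Center reports are a separate measurement and cannot be derived this way.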
Much of the new CPU's improved performance can be accounted for by the addition of an arithmetic unit and an increase in the number of vector pipelines -- all integrated on a single chip that is the first to surpass 100 GFLOPS.
But Kobayashi notes that a new feature of the SX-9, the inclusion of an assignable data buffer or ADB, has also helped boost performance significantly. "ADB is software-controllable cache memory," he explains. "It lets the user assign the data to be cached, which prevents it from being evicted."
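The idea of letting software decide which data stays cached can be illustrated with a toy model. The sketch below is not NEC's ADB hardware or its programming interface, neither of which the article describes in detail. It is a hypothetical least-recently-used cache written in Python, with an invented `pin` method standing in for "assigning" data, and an invented access trace in which two frequently reused items compete with a long stream of one-off data.

```python
from collections import OrderedDict


class PinnableCache:
    """Toy LRU cache whose pinned keys are never chosen for eviction.

    Hypothetical model for illustration only; not the SX-9 ADB design.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.lines = OrderedDict()  # keys ordered from least to most recently used
        self.pinned = set()
        self.hits = 0
        self.misses = 0

    def pin(self, key: str) -> None:
        """Mark a key so that, once cached, it cannot be evicted."""
        self.pinned.add(key)

    def access(self, key: str) -> None:
        if key in self.lines:
            self.hits += 1
            self.lines.move_to_end(key)  # refresh recency
            return
        self.misses += 1
        if len(self.lines) >= self.capacity:
            # Evict the least recently used key that is not pinned.
            victim = next((k for k in self.lines if k not in self.pinned), None)
            if victim is None:
                return  # every line is pinned, so this access bypasses the cache
            del self.lines[victim]
        self.lines[key] = None


def run_trace(cache: PinnableCache) -> int:
    """Reuse two 'hot' items between bursts of streaming data; return the hit count."""
    for start in range(0, 100, 4):
        cache.access("hot0")
        cache.access("hot1")
        for i in range(start, start + 4):
            cache.access(f"stream{i}")
    return cache.hits


plain = PinnableCache(capacity=4)
plain_hits = run_trace(plain)

assigned = PinnableCache(capacity=4)
assigned.pin("hot0")
assigned.pin("hot1")
assigned_hits = run_trace(assigned)

print("hits without pinning:", plain_hits)
print("hits with pinning:   ", assigned_hits)
```

In this made-up trace the streaming data pushes the reused items out of the plain cache before they are needed again, so it records no hits, while the pinned version hits on every reuse after the first load. The size of the gap is an artifact of the toy trace and says nothing about real SX-9 workloads.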
In a simulation used to detect the presence of land mines with electromagnetic waves, for instance, performance increased by 20 percent when ADB was used. In another simulation, which tracked the movement of tectonic plates (the cause of earthquakes), the use of ADB improved performance by 75 percent, while a simulation involving the physics of plasma under certain conditions saw performance double when employing ADB.
Despite such gains, Kobayashi has a gripe with the current ADB design: the cache space is limited to just 256 kilobytes. This means users cannot place all the target data in the cache; rather, they must select only the portion that they judge will work most effectively in ADB. To determine the optimum amount of cache memory, the Cyberscience Center, which is developing a software simulator based on the SX-9 architecture to design future supercomputer models, ran simulations using real application code. To achieve the highest performance, the researchers found that a minimum of 8 MB of ADB memory is necessary. NEC has been so advised.