October 28, 2010
The end of US dominance at the top of the TOP500 appears to be at hand. Tianhe-1A, a new Chinese supercomputer powered by over 7,000 NVIDIA Tesla GPUs, has recorded a Linpack score of 2.507 petaflops. That would beat out Oak Ridge National Lab's 1.759 petaflop Jaguar machine, the current TOP500 title holder, by a wide margin.
It would also be the first time in six years that a non-US supercomputer held the number one spot. From June 2002 to June 2004, Japan's Earth Simulator was the fastest supercomputer in the world. In September 2004, it yielded its title to the new kid on the block -- IBM's Blue Gene/L. The US has never looked back.
The concentrated performance available in high-end discrete GPUs has opened up the petaflop club to a lot more players. But to get to the top of the heap, you need thousands of GPUs. Tianhe-1A, for example, sports 7,168 of them -- in this case, NVIDIA Tesla M2050 (Fermi) GPUs. These represent the lion's share of FLOPS in the system, despite the presence of 14,336 accompanying CPUs. Tianhe-1A also comes with 262 TB of memory and 2 PB of Lustre-based storage.
Although the Linpack performance is a stunning 2.5 petaflops, the system left a lot of potential FLOPS on the table. Its peak performance is 4.7 petaflops, yielding a Linpack efficiency of just over 50 percent. To date, this is a rather typical Linpack yield for GPGPU-accelerated supers. Because the GPUs are stuck on the relatively slow PCIe bus, the overhead of shipping calculations to the graphics processors chews up quite a few cycles on both the CPUs and GPUs.
By contrast, the CPU-only Jaguar has a Linpack/peak efficiency of 75 percent. Even so, Tianhe-1A draws just 4 megawatts of power, while Jaguar uses nearly 7 megawatts and yields 30 percent less Linpack.
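The arithmetic behind those comparisons is simple enough to sketch. The snippet below is a back-of-the-envelope check, assuming the round-number power figures quoted above (4 MW for Tianhe-1A, roughly 7 MW for Jaguar) and a Jaguar peak of about 2.33 petaflops (not stated in this article; inferred from its 75 percent efficiency):

```python
def linpack_efficiency(rmax_pflops, rpeak_pflops):
    """Fraction of theoretical peak achieved on the Linpack benchmark."""
    return rmax_pflops / rpeak_pflops

def mflops_per_watt(rmax_pflops, megawatts):
    """Sustained megaflops per watt: (PF * 1e15 flops) / (MW * 1e6 W) / 1e6."""
    return rmax_pflops * 1e3 / megawatts

# Rmax from the article; Rpeak/power figures are approximate assumptions.
systems = {
    "Tianhe-1A": {"rmax": 2.507, "rpeak": 4.70, "mw": 4.0},
    "Jaguar":    {"rmax": 1.759, "rpeak": 2.33, "mw": 7.0},
}

for name, s in systems.items():
    eff = linpack_efficiency(s["rmax"], s["rpeak"])
    ppw = mflops_per_watt(s["rmax"], s["mw"])
    print(f"{name}: {eff:.0%} Linpack efficiency, {ppw:.0f} Mflops/W")
```

Under those assumptions, Tianhe-1A comes out around 53 percent efficient but well over twice as power-efficient as Jaguar on sustained Linpack, which is the trade-off GPU-accelerated designs were making at the time.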
Despite the exotic nature and stature of the Tianhe-1A system, it is targeted at typical high performance computing applications, including oil exploration, equipment development, biomedical research, animation design, weather forecasting, financial risk analysis, remote sensing, materials research, and the like.
Like its immediate ancestor, Tianhe-1, the Tianhe-1A system was developed by the National University of Defense Technology (NUDT) and is being housed at the National Supercomputer Center in Tianjin. Tianhe-1, though, was built using AMD GPUs, specifically 2,560 dual-GPU ATI Radeon HD 4870 X2 processors. That system was launched in 2009. Somewhere along the line, NUDT decided to switch horses and go with NVIDIA gear. Since there is currently no CUDA port for AMD GPUs, software compatibility between the two systems is going to be problematic, unless they go the OpenCL route.
China's enthusiasm for GPGPUs has propelled its supercomputing capacity significantly. Even before Tianhe-1A, the country claimed three systems in the top 20: Nebulae at number 2, Tianhe-1 at number 7, and Mole 8.5 at number 19, all of which use GPUs. The US, Germany, and the UK currently have no GPU-equipped systems on the TOP500.
Although the official list results won't be revealed until the middle of November, it's doubtful that a secret supercomputer is lying in wait, ready to challenge a 2.5 petaflop machine. But we'll find out soon enough.
Posted by Michael Feldman - October 28, 2010 @ 12:40 AM, Pacific Daylight Time
Michael Feldman is the editor of HPCwire.