July 14, 2011
The idea that the most successful technologies become invisible doesn't yet apply to GPU computing, but it's getting there. This week there were a handful of major HPC system announcements based on GPU-equipped platforms, but you wouldn't have known that from the headlines. No longer the interloper in high performance computing, GPUs are beginning to fade into the background, just like every other mainstream HPC technology.
On Monday, Bright Computing announced that Drexel University has installed a large cluster to be used for its astrophysics and molecular dynamics research. In this case large means 176 peak teraflops -- not bad for a university with fewer than 25,000 students. Actually the system's peak performance is even larger than that. The 176 teraflops are attributed to 68K NVIDIA GPU cores in the machine. That works out to about 133 of the latest 512-core Tesla GPUs at 1.33 single-precision teraflops per processor. The CPUs in the system were even more invisible though; they weren't even mentioned.
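For anyone who wants to check the Drexel math, here is a back-of-the-envelope sketch. It assumes "68K" means roughly 68,000 cores and takes the per-GPU peak as quoted; the variable names are just for illustration.

```python
# Rough check of the Drexel figures quoted above (inputs are rounded).
GPU_CORES_TOTAL = 68_000      # "68K NVIDIA GPU cores" (assumed to mean ~68,000)
CORES_PER_GPU = 512           # 512-core Tesla part
PEAK_TFLOPS_PER_GPU = 1.33    # peak teraflops per GPU, as quoted

# Number of GPUs implied by the core count
gpu_count = round(GPU_CORES_TOTAL / CORES_PER_GPU)

# Aggregate GPU peak implied by that count
peak_tflops = gpu_count * PEAK_TFLOPS_PER_GPU

print(f"GPUs: {gpu_count}")
print(f"Peak: {peak_tflops:.1f} teraflops")
```

With these rounded inputs the result lands within a teraflop of the announced 176, which is about as close as rounded press-release numbers will get you.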
Bright Computing's notable contribution here is its support for GPUs -- CUDA 4.0 specifically -- in its cluster management offering. Today, though, all cluster and workload managers support GPU computing to one extent or another. They have to, given the increasing level of penetration of GPUs in HPC clusters. The idea is to help automate the management of the GPU resources in the cluster so that the system admins don't have to treat these CPU-GPU machines like exotic animals.
On Wednesday, SGI announced that Swinburne University of Technology in Australia is buying a Rackable C3108/Altix UV combo system that will deliver 130 teraflops. Like the Drexel super, the Swinburne machine will be used for astrophysics computations. And, if you weren't paying close attention, you might not have noticed that the system will incorporate NVIDIA GPUs, in this case, a combination of Tesla C2070 and M2090 GPUs. Although no specifics were offered about the number of Tesla parts employed, it's a good bet that most of the FLOPS are from the GPU side.
Meanwhile the gang at T-Platforms was talking up the Graph 500 performance of their Lomonosov super, installed at Moscow State University. Although Lomonosov was ranked third on the list, it set a new performance record, hitting 43.5 GE/s (billion edges processed per second). The metric is an attempt to measure the ability of computers to perform data-intensive operations, in contrast to the TOP500's Linpack benchmark, which measures a computer's floating-point computational prowess.
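To give a feel for what an edges-per-second number measures, here is a minimal sketch that runs a breadth-first search over a tiny made-up graph and divides the edges examined by the elapsed time. This is not the Graph 500 reference code: the official benchmark has its own graph generator, validation step, and edge-counting rules. The `toy_graph` data and the `bfs_edges_traversed` helper are hypothetical, and this version simply counts every adjacency entry it looks at.

```python
import time
from collections import deque


def bfs_edges_traversed(adjacency, source):
    """Breadth-first search from source; returns the number of adjacency entries examined."""
    visited = {source}
    queue = deque([source])
    edges = 0
    while queue:
        vertex = queue.popleft()
        for neighbor in adjacency[vertex]:
            edges += 1  # count each adjacency entry we look at
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return edges


# Hypothetical toy graph: an undirected 4-cycle stored as adjacency lists
toy_graph = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2],
}

start = time.perf_counter()
edges = bfs_edges_traversed(toy_graph, source=0)
elapsed = time.perf_counter() - start

# Edges per second, then scaled to billions (GE/s)
rate = edges / elapsed if elapsed > 0 else float("inf")
print(f"{edges} edges examined, {rate / 1e9:.6f} GE/s")
```

The work here is pointer chasing and memory access rather than floating-point math, which is why a machine's ranking on this kind of test can differ so much from its Linpack standing.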
Lomonosov was recently upgraded to 1.3 petaflops, thanks to -- you guessed it -- NVIDIA GPUs. In this case, the upgrade added 863 GPU teraflops (courtesy of T-Platforms' NVIDIA Tesla X2070-equipped TB2-TL blades) to Lomonosov's existing 510 teraflops. It is not clear, though, whether the GPU parts were used to achieve the record-breaking Graph 500 result.
Jumping now to China, there was the news that the Tianhe-1 supercomputer has gone into operation at the Changsha Supercomputer Center. It looks like the story originated with China Central Television (CCTV) and was subsequently picked up by the IDG News Service. The system, which is reported to reach a peak performance of 1.1 petaflops, apparently went into production last weekend. According to the report, by October the system will be upgraded to 3 petaflops.
Tianhe-1 has an odd history. It was the world's first "petascale" supercomputer that employed GPUs, in this case, AMD/ATI Radeon HD 4870 X2 processors. It debuted in the November 2009 TOP500 rankings as a 1.2 (peak) petaflop machine, earning the number five position on the list. By November 2010, it had disappeared from TOP500, replaced by the now-famous Tianhe-1A, a much larger GPU-equipped Chinese super that delivered 4.7 peak petaflops using NVIDIA parts.
What happened to the Tianhe-1 since last November is a mystery. But given that its peak performance has been shaved by 100 teraflops, I suspect the configuration was modified. Whether that means different GPUs, fewer GPUs, or no GPUs remains to be seen. If you're interested in the IDG/CCTV report, take a look at the YouTube video.
By the way, even though these CPU-GPU machines are becoming more commonplace, I've noticed that the naming convention for them has not quite settled. Some are calling them hybrid systems, while others are referring to them as heterogeneous machines. My preference is the latter, since hybrid implies a mixing of DNA, which I take to mean the processor's transistors. Since the GPUs and CPUs are still discrete entities, heterogeneous seems the better nomenclature here.
Even the AMD Fusion chips and future Project Denver processors from NVIDIA, which mix CPU and GPU components on-chip, still seem more heterogeneous than hybrid to me. But I have a feeling when GPUs are integrated to this level and, more importantly, when applications are oblivious to the mix of underlying computational units, we'll just be calling them processors again. That's what happens when technology becomes invisible.
Posted by Michael Feldman - July 14, 2011 @ 6:52 PM, Pacific Daylight Time
Michael Feldman is the editor of HPCwire.