HPCwire

Since 1986 - Covering the Fastest Computers
in the World and the People Who Run Them

Velocity Micro Makes an HPC Play


The GPGPU phenomenon is continuing to attract lots of attention in the high performance computing community and is starting to bring some new players into the market. The introduction of commodity GPU processors offering teraflop-level performance suggests supercomputing can now be had for near-PC prices. The current challenge is to package those GPUs so that their power can be tapped by the average HPC practitioner.

The latest attempt at this comes from Velocity Micro, which until this week was known for its bleeding-edge PCs and desktop systems for power users, especially gaming enthusiasts. On Monday, the company jumped into the HPC market by launching a new line of NVIDIA Tesla GPU-accelerated HPC workstations. The products consist of customizable desktop systems based on Intel CPUs and NVIDIA's newest C1060 card. The C1060 is based on NVIDIA's 10-series GPU, which offers almost a teraflop of peak single precision performance (and around 100 gigaflops of double precision). With 4 GB of local memory, the C1060 has more than twice the capacity of NVIDIA's first generation C870.

The Velocity workstations, which range in price from $3,995 to $16,995, come preloaded with the CUDA SDK (NVIDIA's C programming framework for GPU computing), along with either Windows XP or Fedora Core 8. The hardware is available in three basic configurations: an entry-level system containing a dual- or quad-core Intel Core 2 processor and an optional C1060 Tesla card; a mid-level system with almost the same Intel CPU options, but up to two C1060s; and a high-end box with single- or dual-socket Xeon quad-core CPUs and up to three C1060s. The company rates the three configurations at 1, 2 and 3 teraflops, respectively, with the GPU card or cards providing most of the horsepower. NVIDIA Quadro GPUs are also available to drive video, enabling GPU computing and visualization to take place simultaneously.

The three-teraflop configuration, with dual quad-core Xeons, three Tesla cards and a Quadro GPU, consumes plenty of juice. In an attempt to max out the system, the Velocity engineers have been able to drive power consumption up to 950 watts, and that's probably the most a real-world application would consume. The systems are all air-cooled, presumably very effectively, since hot chips are standard gear on most Velocity systems. In fact, for the company's high-end consumer boxes, overclocking is fairly common, although not for the HPC product line.
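To make the arithmetic behind those ratings concrete, here is a minimal Python sketch. It assumes the article's round per-card figures (roughly 1 teraflop single precision and 0.1 teraflop double precision per C1060) and the reported 950-watt ceiling; the constant and function names are illustrative, and the numbers are not vendor specifications.

```python
# Rough per-card peaks taken from the article's round numbers (assumptions,
# not vendor specifications): ~1 TFLOPS single precision, ~0.1 TFLOPS double.
SP_TFLOPS_PER_CARD = 1.0
DP_TFLOPS_PER_CARD = 0.1

# Maximum Tesla C1060 count per configuration, as described in the article.
MAX_CARDS = {"entry": 1, "mid": 2, "high-end": 3}

# Power draw the article reports for a maxed-out high-end system.
HIGH_END_WATTS = 950.0


def peak_tflops(cards: int, per_card: float) -> float:
    """Aggregate GPU peak in TFLOPS, ignoring the CPU contribution."""
    return cards * per_card


def gflops_per_watt(tflops: float, watts: float) -> float:
    """Convert a peak in TFLOPS and a power draw in watts to GFLOPS/W."""
    return tflops * 1000.0 / watts


if __name__ == "__main__":
    for name, cards in MAX_CARDS.items():
        sp = peak_tflops(cards, SP_TFLOPS_PER_CARD)
        dp = peak_tflops(cards, DP_TFLOPS_PER_CARD)
        print(f"{name}: {sp:.1f} TFLOPS SP, {dp:.1f} TFLOPS DP")
    top_sp = peak_tflops(MAX_CARDS["high-end"], SP_TFLOPS_PER_CARD)
    print(f"high-end: {gflops_per_watt(top_sp, HIGH_END_WATTS):.2f} GFLOPS/W")
```

These are peak figures only. Sustained performance on a real application depends on how well the code maps to the GPU, so the result is best read as an upper bound.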

The GPU-equipped machines are designed for typical HPC end-users: scientists, engineers and other technical analysts. Since HPC is new territory for Velocity, the company has partnered with James River Technical (JRT), a reseller that specializes in the HPC market. JRT facilitates deals for vendors like SGI, especially for the higher education and government markets. The Velocity-JRT partnership is an especially nice fit here since the lowest hanging fruit for these new workstations is likely to be researchers at universities and government labs.

These types of users have already shown a lot of interest in GPU-accelerated computing and are on the lookout for production-ready systems. According to JRT president Tom Mountcastle, many of their customers are constrained more by budget than imagination. "This appeals to the research community because they like being out there on the edge," he said.

On the other hand, since the machines will be on people's desktops, the big government labs and the universities aren't interested in inexpensive systems that lack vendor support, since unsupported machines chew up a lot of system administration time. In this area, Velocity has a good track record. Over the years, the company has collected numerous awards for craftsmanship, service and reliability from the likes of PC Magazine, CNET, and PC World.

The company is also among the first, if not the first, to take advantage of the latest hardware technology for its consumer products. In that sense, it sees the new Tesla hardware and CUDA as a game-changer for HPC. From Velocity's perspective, NVIDIA's introduction of the more powerful 10-series GPUs and the maturity of the CUDA software stack indicate that the technology pieces are now in place for a commercially viable high performance PC. "We've determined there is a hole in the market for entry-level high performance computing and that's where our product will be focused," said Randy Copeland, Velocity Micro's CEO and president.

CUDA, in particular, seems to have reached a critical mass. A quick tour of NVIDIA's CUDA site reveals dozens of academic codes and a smattering of commercial applications and libraries that have been accelerated. Application areas include the usual HPC verticals: finance, life sciences, oil & gas, EDA, digital content creation and basic science research. A number of bindings and libraries are also now available so that Python, MATLAB, and other environments can tap into GPGPU.

Now, with the 10-series Tesla products due to be released this month, OEMs and integrators can construct GPU-equipped servers and desktop boxes with double-precision floating point support. Presumably, workstation vendors like Dell and HP could build accelerated HPC desktop systems, but since the demand for these machines is still largely unknown, these firms will probably be content to watch more specialized companies like Velocity from the sidelines. Likewise, IBM could develop an equivalent Cell-BE based workstation, but the market for such a system is likely to be much more constrained than the market for systems based on the more ubiquitous GPU.

It was less than a month ago that Cray introduced its own entry-level supercomputer, the CX1. Whereas the Velocity offering is essentially an SMP machine with GPU acceleration, the CX1 is the more traditional cluster architecture, but scaled down for personal use. JRT, which sells both systems, seems to be covering its bases here. It's quite possible both machines can find their own niches -- the CX1 for more traditional MPI-based applications and the Velocity boxes for global address space applications that lend themselves to acceleration. The CX1 is also in a higher price band, with the least expensive configuration starting at $25,000 -- about $8,000 more than the $16,995 top-of-the-line Velocity machine.

Even though the first Velocity systems just hit the streets this week, the company already has a second generation in the works. They intend to quickly move to four-socket CPU configurations, and will incorporate the Nehalem processor when it becomes available later this year. Further down the road, it may be possible to hook the workstations together for applications requiring greater scale.

If Velocity Micro can make a go of this, the "Attack of the Killer Micros" saga will have added a new chapter. Instead of just commodity microprocessor hardware invading HPC's turf, PC vendors themselves could start eating into the market from the bottom up. Meanwhile, it will be interesting to see if any other desktop vendors are tempted to jump into the HPC arena.




