NVIDIA Takes Direct Aim at High Performance Computing

By Michael Feldman

June 22, 2007

For the past year and a half, NVIDIA has been putting together the product strategy for the company's high performance computing platform. On Wednesday, NVIDIA announced Tesla, a GPU product line targeted squarely at HPC customers. The new NVIDIA products are designed to act as computational accelerators for workstations and servers that host high performance technical computing applications.

Tesla represents an evolution of NVIDIA's thinking about serving HPC customers. Last year, the company entered the arena of general-purpose computing with GPUs (GPGPU) in earnest with its high-end GeForce and Quadro GPUs. For software support, it introduced its CUDA C compiler to offer relatively low-level access to the computing capabilities of its GPUs. According to NVIDIA, the CUDA tools have been downloaded by 3,000 to 4,000 developers since they were introduced in November 2006. For those interested in higher levels of abstraction, a GPGPU MATLAB library plug-in will soon be released.

With these early tools, technical computing users were able to demonstrate application performance increases of between 40 and 240 times compared to traditional x86 platforms. The applications ranged from neuron simulation and seismic modeling to MRI processing.

But the GeForce and Quadro products are designed mainly for visualization applications in a personal workstation or PC setup. There is no reasonable way to scale these devices across a cluster of servers to achieve a more generalized HPC solution. Nor was there a technology roadmap for NVIDIA's mainstream GPU lines that emphasized computing performance over graphics performance. Tesla is meant to address both gaps. With three separate GPU product lines, NVIDIA is able to target distinct application areas that reflect the company's customer base. The GeForce products are geared for consumer/entertainment computing and visualization applications; the Quadro boards, for professional design and creation applications; and now the Tesla products, for traditional HPC applications.

Tesla was designed with the kind of form factors, power profiles, reliability levels and interconnect types that are compatible with high performance computing workstations and server platforms. There are three initial offerings: a 4-GPU server board, a 2-GPU workstation board, and a GPU computing processor. All the initial products will be based on the current high-end Quadro GPU, offering over 500 gigaflops of single-precision performance per processor.

The Tesla S870 server board is really the big breakthrough for NVIDIA, since it represents their first product designed for the HPC datacenter. It fits in a 1U chassis, contains four GPUs, and communicates with the server host using a Gen 2 PCI Express switch. Temperature sensors and system monitoring are included to provide the level of reliability expected in datacenter hardware. The board dissipates 550 watts. Add another 10 watts for a PCI Express host adapter card. That might seem like a lot of juice for an accelerator, but for 560 watts you get over 2 teraflops of single-precision performance. MSRP for the server board is $12,000.

The Tesla server also comes in a 2-GPU version, and an 8-GPU version is in the works. The latter configuration is expected to improve upon the performance per watt ratio somewhat.

The other two initial Tesla products are targeted for workstations or PCs. The Tesla D870 is a 2-GPU board that connects to a deskside workstation. Like the server product, it connects to the host via PCI Express. The D870 uses 550 watts of power and lists for $7,500. The Tesla C870 is a single 170-watt GPU processor that fits in a PCI Express slot in a workstation or PC. It lists for $1,499.
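To put the three products side by side, the quoted figures can be reduced to rough performance-per-watt and performance-per-dollar ratios. The sketch below is only a back-of-the-envelope calculation: it assumes exactly 500 gigaflops per GPU (the article says "over 500"), so every result is a lower bound, and the class and function names are illustrative rather than anything supplied by NVIDIA.

```python
from dataclasses import dataclass

# Assumption: 500 GFLOPS per GPU, the floor quoted in the article
# ("over 500 gigaflops"), so all ratios below are lower bounds.
GFLOPS_PER_GPU = 500.0


@dataclass(frozen=True)
class TeslaProduct:
    """Hypothetical record holding the list figures quoted in the article."""
    name: str
    gpus: int
    watts: float
    price_usd: float

    @property
    def gflops(self) -> float:
        return self.gpus * GFLOPS_PER_GPU

    @property
    def gflops_per_watt(self) -> float:
        return self.gflops / self.watts

    @property
    def gflops_per_dollar(self) -> float:
        return self.gflops / self.price_usd


PRODUCTS = [
    # S870: 550 W for the board plus 10 W for the PCI Express host adapter.
    TeslaProduct("S870", gpus=4, watts=560.0, price_usd=12000.0),
    TeslaProduct("D870", gpus=2, watts=550.0, price_usd=7500.0),
    TeslaProduct("C870", gpus=1, watts=170.0, price_usd=1499.0),
]


def summarize(products):
    """Map product name -> (GFLOPS per watt, GFLOPS per dollar), rounded."""
    return {
        p.name: (round(p.gflops_per_watt, 2), round(p.gflops_per_dollar, 3))
        for p in products
    }


if __name__ == "__main__":
    for name, (per_watt, per_dollar) in summarize(PRODUCTS).items():
        print(f"{name}: >= {per_watt} GFLOPS/W, >= {per_dollar} GFLOPS/$")
```

On these list figures, the S870 comes out ahead on single-precision performance per watt and the C870 on performance per dollar. Both conclusions depend entirely on the quoted power and price numbers.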

Andy Keane, general manager of GPU Computing at NVIDIA, thinks most of the company's early technical computing customers will migrate from the current GeForce and Quadro platforms to Tesla. Customers that are using the current products for both visualization and computing may stick with them if the computing side of their application doesn't outrun the GPU performance. But Tesla is clearly meant to be the future of technical computing at NVIDIA.

Although the initial offerings are based on NVIDIA's 8-series devices, as Tesla evolves it will sport its own GPU variants, which may run with faster clock speeds (but perhaps slower on-chip memory) than GPUs whose primary focus is to drive visual displays. More significantly, Keane says that double-precision floating point capability will be added to the entire Tesla product line by the end of 2007.

The addition of double-precision capability will open up the entire technical computing market for NVIDIA, since the inherent limitations of single-precision arithmetic will be removed. So unless AMD comes out with a double-precision GPU in the next few months, NVIDIA will be the vendor to pioneer 64-bit floating point in GPGPU computing. As such, NVIDIA becomes a more direct competitor to ClearSpeed, whose math co-processor boards also target the HPC market. Although NVIDIA has not released power or performance specs for its upcoming double-precision devices, one can surmise that ClearSpeed will be able to claim a performance per watt advantage, but perhaps not a performance per dollar advantage. Depending on how Intel's Larrabee processor development plays out, NVIDIA could eventually run into additional competition there as well.

In any case, there may be plenty of acceleration opportunities to go around. The commercial HPC market is growing rapidly, even faster than the general IT market. According to IDC, technical computing revenues will reach $14.2 billion by 2010. Currently, the oil & gas and financial services segments represent two of the highest growth areas. But manufacturing, biotech and government HPC are also expanding. NVIDIA thinks its new HPC line can ride a lot of this growth as users start to figure out that Tesla-equipped workstations can replace decent-sized clusters and Tesla-equipped clusters can match the raw performance of some high-end supercomputers.

“Now when we go into an IT department and they ask us how to put GPUs into their datacenter, we have a specific answer and a product that exactly fits what they expect to buy,” says Keane.
