Startup Uses Data Compression to Speed Applications

By Michael Feldman

January 25, 2012

Silicon Valley-based Samplify Systems has launched an application acceleration technology designed to speed up codes that sling a lot of numerical data. But rather than throwing bigger, faster hardware at the problem, the company aims to make programs speedier by optimizing the data flow between the compute cores and the outside world.

Samplify, whose roots are in signal compression, has extended the technology to address all numerically intensive applications. For HPC users, Samplify’s heart is certainly in the right place: in high performance computing, application acceleration is a thirst that can never be quenched.

The industry’s current performance problem is rooted in multicore designs. Processors are getting more powerful at a Moore’s Law clip thanks to a proliferation of cores, while the bandwidth of external subsystems like memory and I/O is increasing much more slowly (and in discrete steps). That imbalance is the principal reason that HPC applications typically use just a small fraction of their host hardware’s peak performance.

For example, a science code called ECCO (Estimating the Circulation and Climate of the Ocean), one of the workhorse applications on NASA’s Pleiades supercomputer, uses only about 1.4 percent of the machine’s 1.3 petaflop peak performance. That’s mainly because the CPUs spend the majority of their time idly waiting for data to arrive. Unfortunately, that scenario is not much of an outlier: according to a NASA research study, most of the applications on Pleiades have sustained performance in the 3 to 8 percent range.

Such stark inefficiencies are what drove Samplify to come up with its application acceleration offering, known as APAX. In a nutshell, APAX compresses numerical data (both integer and floating point) that flows through a system, thereby increasing data throughput. And it does so in a manner that is transparent to the application.

The technology is being offered in both hardware and software forms and can be deployed in a variety of ways. Specifically, the compression technology can be inserted at all the usual choke points in a computer system — memory, I/O, networks and storage — in order to attack application bottlenecks at their source.

According to Al Wegener, Samplify founder and CTO, most HPC users are not aware that there are a significant number of bits that can be squeezed out of their codes. In today’s supercomputing culture, the traditional solution to data bandwidth constraints has been to over-provision the hardware. “All of the HPC customers we talked to have never even thought of using compression for their applications,” Wegener told HPCwire.

The APAX technology supports both lossless and lossy compression schemes. Under APAX, the lossless scheme usually attains at least a 2:1 compression, effectively doubling bandwidth at the intended choke point. With lossy schemes, which are under user control, APAX compression can easily hit 4:1 and go as high as 8:1.
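As a rough illustration of the redundancy a lossless pass can reclaim, the Python sketch below compresses a block of limited-precision floating point data with zlib. This is only a generic stand-in, not APAX, and the ratio it prints depends entirely on the data; the point is simply that numerical arrays often carry fewer bits of real information than the 64-bit containers they sit in.

```python
import zlib
import numpy as np

# Smooth, limited-precision data parked in full 64-bit containers.
# Rounding to three decimals mimics sensor-style inputs whose real
# information content is far below 52 mantissa bits.
field = np.round(np.sin(np.linspace(0.0, 100.0, 1_000_000)), 3)

raw = field.tobytes()
packed = zlib.compress(raw, level=6)
print(f"lossless ratio: {len(raw) / len(packed):.2f}:1")
```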

Here though, the user has to be somewhat careful, since lossy schemes rely on trimming off some of the numerical precision. In general, application programmers tend to be a bit lazy with precision, preferring to use generic data types in their codes and gravitating towards double precision floating point. But casting 12-bit integer inputs into 64-bit floating point values for the sake of convenience doesn’t magically increase the accuracy of the results and ends up wasting a lot of bits. In working with trial customers, Samplify has found that most applications can tolerate lossy compression in the 4:1 to 6:1 range before the results start to diverge.
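The precision-waste argument is easy to demonstrate. The sketch below (an illustration, not APAX itself) stores 12-bit integer samples in 64-bit floats and then zeroes 40 of the 52 mantissa bits; every value survives untouched because only a handful of those bits were ever carrying information, which is exactly the headroom a lossy scheme can trade for bandwidth.

```python
import numpy as np

# 12-bit sensor-style samples parked, wastefully, in 64-bit floats.
samples = np.arange(0, 4096, dtype=np.float64)

# Zero the low 40 of the 52 mantissa bits. A 12-bit integer needs at
# most 11 explicit mantissa bits, so nothing is lost.
mask = ~np.uint64((1 << 40) - 1)
truncated = (samples.view(np.uint64) & mask).view(np.float64)

print(np.array_equal(samples, truncated))  # True: the rest was padding
```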

From a performance per watt perspective, APAX hardware is probably the most efficient way to go. For example, if a chipmaker wanted to insert compression into its memory controller design, it would simply license the APAX IP block (a couple of hundred logic gates) from Samplify.

Once in the controller, the compression logic, along with compatible drivers, would squeeze the bits sent from the compute cores so that all numerical data would be stored in DRAM in compressed form. When reading from memory, the same logic would decompress the data before passing it back to the number-crunching silicon. Assuming 2:1 compression, memory bandwidth for all numerical data traffic would be doubled. Conveniently, effective memory capacity would double as well.

In an HPC environment, that can add up quickly. Wegener offers the case of NVIDIA’s Tesla GPU devices: with 2:1 compression, the GPU card’s 6 GB of GDDR5 memory turns into 12 GB of effective storage, and its 150 GB/sec of bandwidth becomes 300 GB/sec. “I think that would be a big deal,” says Wegener.

Other likely targets for the compression technology would be network adapters, such as InfiniBand or Ethernet NICs, storage controllers, and Southbridge chips. Along with modified drivers, the compression-spiked ASICs would be able to turbo-charge data performance across a system, cluster or even a whole datacenter.

For HPC applications on existing hardware, the most straightforward approach is to insert the APAX software into existing applications or wrap it around MPI libraries. This could be especially useful on more generic cloud infrastructure, such as what Amazon offers, where network capability and topology are much less conducive to HPC communication than those of a purpose-built supercomputer.
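As a hypothetical sketch of that wrap-the-MPI pattern (APAX’s actual API is not described here), buffers are compressed just before they hit the network and decompressed on arrival. In the sketch, zlib stands in for the APAX codec, an in-memory queue stands in for real MPI send/receive calls, and the function names are invented for illustration.

```python
import zlib
import numpy as np

# Hypothetical wrapper pattern: compress numerical buffers on their way
# into the interconnect, decompress on the way out. zlib stands in for
# the APAX codec; a simple queue stands in for real MPI calls.
channel = []

def compressed_send(array):
    payload = zlib.compress(array.tobytes())
    channel.append((payload, array.dtype.str, array.shape))

def compressed_recv():
    payload, dtype, shape = channel.pop(0)
    return np.frombuffer(zlib.decompress(payload), dtype=dtype).reshape(shape)

# Round trip a smooth field; fewer bytes cross the (simulated) wire.
data = np.linspace(0.0, 1.0, 100_000)
compressed_send(data)
assert np.array_equal(data, compressed_recv())  # lossless round trip
```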

While the technology Samplify is offering is not a panacea for all these data bottlenecks, it has the potential to make a significant dent in throughput and storage. Right now the company is in the process of collecting proof points for the technology. According to Wegener, APAX has been validated by two Samplify investors: Schlumberger, an oil & gas exploration firm, and Mamiya, a Japanese manufacturer of high-end digital cameras. Schlumberger is using the technology in its software incarnation, while Mamiya has embedded APAX in its FPGAs. Other trials are underway with seismic and multiphysics customers, but the company is not willing to name names at this point.

Samplify envisions a market for APAX in high performance and cloud computing as well as at the other end of the IT spectrum in mobile computing devices and consumer electronics. The company estimates a total addressable market of $700 million by 2014: $370 million for APAX IP blocks (Verilog RTL) on 1.8 billion devices and $330 million for APAX software on 16.7 million cores.

As of this week, the APAX technology is ready to ship in software form. The hardware IP block will be available for licensing in the middle of the year. Pricing has not been disclosed.
