The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
May 06, 2008
Since InfiniBand came onto the scene, users have focused their efforts on using the high performance network fabric to connect compute and storage boxes within the datacenter. But a couple of enterprising companies, Network Equipment Technologies and Obsidian Research Corp., have developed InfiniBand connectivity for wide area networks (WANs). In both instances the vendors have developed solutions that can transparently connect IB clusters and storage over long distances -- hundreds or thousands of miles. From the application's viewpoint, the remote compute and storage nodes look and (more or less) act as if they're sitting right next to each other.
The benefits of long distance InfiniBand mirror its advantages in the datacenter -- namely high bandwidth and low latency. While WAN InfiniBand performance won't always match local performance, solutions have demonstrated user data rates of up to 8 Gbps over thousands of miles across SONET OC-192 or 10 GbE backbones. Bandwidth and latency tend to degrade a bit the farther you go, but unlike with TCP/IP implementations, Quality of Service (QoS) is maintained.
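To put the distance effect in rough numbers, the sketch below estimates one-way propagation delay over fiber and the amount of data that must be in flight to keep an 8 Gbps stream full. It is only a back-of-the-envelope illustration: the two-thirds-of-light-speed figure for fiber and the assumption that the fiber route equals the stated distance are ours, not the vendors', and the function names are made up for this example.

```python
# Rough estimate of propagation delay and bandwidth-delay product.
# Assumptions (not from the article): signals travel through optical fiber at
# about two thirds of the vacuum speed of light, and the fiber route is exactly
# as long as the stated distance. Real routes are longer and add equipment delay.
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_VELOCITY_FACTOR = 2 / 3
KM_PER_MILE = 1.609344


def one_way_delay_s(distance_miles):
    """Propagation delay in seconds for a fiber run of the given length."""
    distance_km = distance_miles * KM_PER_MILE
    return distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_VELOCITY_FACTOR)


def bytes_in_flight(rate_gbps, distance_miles):
    """Bandwidth-delay product over one round trip, in bytes."""
    round_trip_s = 2 * one_way_delay_s(distance_miles)
    return rate_gbps * 1e9 * round_trip_s / 8


if __name__ == "__main__":
    for miles in (100, 1000, 3000):
        delay_ms = one_way_delay_s(miles) * 1e3
        in_flight_mb = bytes_in_flight(8, miles) / 1e6
        print(f"{miles:>5} miles: ~{delay_ms:.1f} ms one way, "
              f"~{in_flight_mb:.1f} MB in flight at 8 Gbps")
```

Under these assumptions, a 1,000-mile link adds about 8 ms each way, so roughly 16 MB has to be outstanding at any moment to sustain 8 Gbps. No amount of engineering removes that delay; the most a WAN bridge can do is avoid adding much on top of it.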
Obsidian's Longbow IB WAN solution has been deployed at NASA Ames, Arizona State University and the University of Florida, and is being researched by Oak Ridge National Laboratory and Ohio State. The Longbow product has also been featured at the last three Supercomputing (SC) conferences. Last year, Canada-based Obsidian set up a subsidiary to go after the lucrative U.S. federal, intelligence and defense market spaces.
Network Equipment Technologies (NET) has a competitive product, the NX5010 InfiniBand bridge, a $100K+ box that is already fairly well-established in the U.S. DoD and Intelligence Community market. NET, a provider of a range of telecommunication platforms, got into the long distance InfiniBand market about a year and a half ago when its government customers started demanding long haul InfiniBand capability. Many of these federal organizations maintain a network of HPC sites dispersed across the country. These customers have developed a need to use wide area clusters to run some of their most critical MPI-based programs. Although the three-letter agencies don't talk about specific applications, wide area InfiniBand is a good fit for things such as dispersed intelligence gathering, network centric warfare, and general data mining.
NET's current InfiniBand offering, the 2U NX5010 box, works with any standard IB protocol. To the subnet manager, the NX5010 looks like a two-port InfiniBand switch. The device acts as a network bridge, converting the InfiniBand stream to the WAN's transport protocol -- ATM, 10 Gigabit Ethernet, or whatever. At the other end, the companion NX5010 box attached to the remote cluster or SAN reverses the conversion. The magic is that the translation to and from the WAN protocol is performed at the 10 Gbps line rate, without losing the InfiniBand semantics or incurring a big latency penalty.
NET says they've sold about 100 NX systems so far. That's hardly a commodity market, but the company now thinks it can drive its solution into the commercial space. As InfiniBand adoption grows beyond HPC, NET is eyeing the demand for real time data capture on remote InfiniBand-equipped storage area networks. The company is looking at the financial market, where there is a real demand to synchronize streaming data in real time across storage silos. In particular, for these institutions, the need for remote disaster recovery (DR) may turn out to be the first killer app for long distance InfiniBand.
In the dot-com days, a number of financial firms on Wall Street bought a lot of dark fiber, which is still underutilized. NET is pitching them the idea of using this capacity for InfiniBand-enabled DR. "They have the bandwidth," says Haseeb Budhani, director of strategic planning for NET. "They just don't have a way to push the data." The traditional TCP/IP solution, which was never intended for high performance data transfer, incurs a heavy latency penalty, especially at longer distances.
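One commonly cited reason TCP struggles on long, fast links is that a single connection can have at most one window of unacknowledged data outstanding per round trip. The sketch below works through that arithmetic. It is an illustration of the general window-over-round-trip bound, not a description of what NET measured; the 16 ms round-trip time is an assumed figure for a link of roughly 1,000 miles, and the function names are hypothetical.

```python
# Upper bound on single-connection TCP throughput imposed by the window size.
# Assumption (not from the article): a 16 ms round-trip time, roughly what
# fiber propagation alone gives over about 1,000 miles.


def window_limited_throughput_bps(window_bytes, rtt_s):
    """Best-case throughput in bits per second: one full window per round trip."""
    return window_bytes * 8 / rtt_s


def window_needed_bytes(target_bps, rtt_s):
    """Window size in bytes needed to sustain target_bps at the given round trip."""
    return target_bps * rtt_s / 8


if __name__ == "__main__":
    rtt_s = 0.016
    classic_window = 64 * 1024  # 64 KiB, the limit without TCP window scaling
    limit_mbps = window_limited_throughput_bps(classic_window, rtt_s) / 1e6
    needed_mb = window_needed_bytes(8e9, rtt_s) / 1e6
    print(f"64 KiB window at 16 ms RTT: at most ~{limit_mbps:.1f} Mbps")
    print(f"Window needed for 8 Gbps at 16 ms RTT: ~{needed_mb:.0f} MB")
```

With these assumed numbers, an unscaled 64 KiB window caps a connection at about 33 Mbps regardless of how much dark fiber sits underneath it, while filling an 8 Gbps pipe would take a window of about 16 MB. Tuned TCP stacks can raise the window, so treat this as a worst-case illustration rather than a verdict on every TCP deployment.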
NET is looking to piggyback onto deployments from Oracle, SAP, EMC, NetApp and system vendors as a way to enter the commercial market. The recent decision by Colfax International to offer NX 5000 systems alongside its high performance cluster gear is a development NET would like to see repeated with other system integrators and OEMs.
While NET is excited about connecting remote storage over IB, at this point, the company doesn't perceive a big demand for long haul computing over InfiniBand outside the government space. But in that market, the need for speed is unrelenting. NET is planning to introduce NX bridges that support a 40 Gbps data rate later this year. These devices will be especially handy if you happen to be connected to a next-generation 40G OC-768 backbone.
But for most organizations, remote computing over high performance networks is still a bit too expensive. While NET expects to drive its NX boxes below $100K at some point, it still makes sense for the average HPC customer to expand their compute capacity on-site. As high performance network infrastructure becomes more commonplace and the InfiniBand ecosystem continues to mature, we may see a more general demand for IB-based wide area networking.
As the only two vendors of WAN InfiniBand gear, Obsidian and NET are in a good position to take advantage of those opportunities. From NET's perspective, Budhani would welcome more players, if only to validate the business opportunity. "More competitors absolutely make a case for the market," he says.