Out of the Fire, into the PAN

By Dennis Barker, GRIDtoday

April 21, 2008

Simplify it, and lower the cost. That design change request, first uttered to the guy who invented the wheel, has been passed along to every datacenter manager on the planet.

For the past several years, IT innovators have gone about building solutions for this decree by developing server virtualization software. Egenera has been one of those virtualization pioneers, but the company didn’t stop at the server. Egenera also virtualizes the I/O infrastructure, then adds some heavy-duty management software to meet the request for simplification.

This vast software-ification is intended to maximize the benefits of that simplification. For example, users should spend less money on server and networking hardware; spend less time cabling and troubleshooting that hardware; keep systems running constantly and efficiently; recover faster from disaster; and be able to respond immediately to changing computing needs.

Egenera’s approach to redesigning the datacenter starts with its concept of the processing area network (PAN). The idea behind the PAN is to “unlock the resources from the server, from the hardware,” says Egenera CTO Peter Manca. Just as the storage area network (SAN) virtualizes storage arrays, the PAN virtualizes processing and memory resources. Networking I/O components (Ethernet cards, switches, etc.) also are virtualized. By turning servers and the glue that connects them into software, Egenera’s approach gives customers “a pool of stateless processing nodes.” A PAN is essentially a big pool of these diskless nodes connected via a high-speed fabric.

Servers in the PAN are not your traditional boxes with a specific hardware configuration. They are software assets that users can allocate to applications as needed, all under software control. These virtual servers (pServers) are mapped to physical blades. Virtual switches (vSwitches) handle the communications between pServers and external networks. Egenera describes a vSwitch as the virtual equivalent of an unmanaged Layer 2 Ethernet switch. Each pServer in a PAN can have as many as 32 of these switches, for a total of 4,096 virtual switches in a single PAN. Thus, Egenera says, a PAN has the capacity to handle clustering and network-intensive applications. The PAN architecture’s support for thousands of internal switches also allows for big consolidation gains in terms of I/O hardware.

These virtual components yield particular advantages for the person in charge of network availability: You can add or reconfigure connections on running servers without rebooting, and you can switch a server to another network without interrupting the services running on that server.

As a result of this freedom from hardware, Manca says, processing and networking resources can be configured and put to work in minutes instead of days, without the headache of moving or rewiring components. “Wire once, and rewire through software,” he says.

“By virtualizing I/O, we’ve made the server stateless, given the server sort of a blank identity. You can run Linux one minute, Windows one minute, Solaris the next,” Manca says. “You can quickly give any server whatever personality you need it to have. You’ve got the ability to use any blade at any time. When you virtualize the infrastructure, people don’t have to buy NICs or HBAs. They don’t have to buy dedicated hardware for their Oracle databases, for example. Fewer servers, fewer licenses. Then there are the soft costs and operating savings. The concept of stateless computing just has so many benefits.”

Master of the PAN

The big (virtual) brain behind the processing area network is PAN Manager, which distinguishes Egenera’s approach and embodies the company’s philosophy of datacenter virtualization. Manca describes it as “a combination of I/O virtualization technology and strong management software.” PAN Manager replaces server and network hardware with software, and then throws in a considerable list of crucial datacenter capabilities.

Egenera says PAN Manager’s virtualization of I/O effectively reduces the amount of time IT staff spends adding, deleting, or manipulating servers. Since the server and all the required parts are software, provisioning a new server, or replacing one that failed, can be done from the PAN Manager console quickly. And because any blade in the system can serve as a backup for any other, high availability is maintained.

High availability is one of the main benefits Egenera mentions when discussing PAN architecture. When so many vital components are in software, you get what Egenera calls server and PAN portability. Each pServer’s image and entire PAN configurations can be shifted between processing arrays. Defining a new blade, or replicating an existing one, can be done entirely from the PAN Manager console. The PAN approach provides real N+1 failover, Egenera says, and that backup measure can be accomplished in three ways: global failover, where every blade is accessible to any pServer in the network; local failover, where a blade is accessible only to pServers within the local PAN; or dedicated failover, where a processing blade is committed to a single pServer. When a pServer fails, PAN Manager automatically maps all disk, network and switch configurations to a spare blade, which boots up with the same characteristics as the blade that failed.
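The three failover scopes can be pictured with a short Python sketch. This is an illustration only, not Egenera's implementation: the `Blade` and `PServer` classes, the `eligible` and `fail_over` functions and every field name are hypothetical, and the article does not say how PAN Manager actually chooses a spare.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Blade:
    name: str
    pan: str                             # PAN the spare blade belongs to
    scope: str                           # "global", "local" or "dedicated" (assumed labels)
    reserved_for: Optional[str] = None   # pServer name, meaningful only for "dedicated"
    in_use: bool = False


@dataclass
class PServer:
    name: str
    pan: str
    config: dict                         # stands in for disk, network and switch settings
    blade: Optional[Blade] = None


def eligible(blade: Blade, server: PServer) -> bool:
    """Apply the scope rules described in the article to one spare blade."""
    if blade.in_use:
        return False
    if blade.scope == "global":          # any pServer may take it
        return True
    if blade.scope == "local":           # only pServers in the same PAN
        return blade.pan == server.pan
    if blade.scope == "dedicated":       # only the one pServer it is committed to
        return blade.reserved_for == server.name
    return False


def fail_over(server: PServer, spares: List[Blade]) -> Optional[Blade]:
    """Remap a failed pServer to the first eligible spare, or return None."""
    for blade in spares:
        if eligible(blade, server):
            blade.in_use = True
            server.blade = blade         # the config travels with the pServer, not the blade
            return blade
    return None
```

Because the configuration lives on the pServer record rather than on the blade, the spare comes up with the same characteristics as the hardware it replaces, which is the point of the stateless design Manca describes.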

Simpler disaster recovery is another of the benefits of Egenera’s iron-into-code technology. “Because the PAN is stateless, all production information is kept in an XML file, so you can clone your production system in a flash,” Manca says. IT departments wanting to do some test and development could load a new XML file into a server in about 5 minutes, he claims.

The PAN also provides built-in, automated load balancing with clusters of pServers running instances of the same application. The balancer sends requests to pServers on the basis of user-adjustable policies. PAN Manager incorporates software to monitor the health of active pServers. There also is a module that watches for application failure and, in the event of it, moves the application according to user policy.
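As a rough picture of policy-driven balancing across a cluster of pServers, consider the following Python sketch. The two policies, the `route` function and the dictionary fields are all assumptions made for illustration; the article does not describe which policies PAN Manager offers or how it tracks health.

```python
from typing import Any, Callable, Dict, List, Optional

Server = Dict[str, Any]   # hypothetical record: name, healthy, active_requests


def least_loaded(healthy: List[Server]) -> Server:
    """Policy: prefer the pServer with the fewest requests in flight."""
    return min(healthy, key=lambda s: s["active_requests"])


def first_available(healthy: List[Server]) -> Server:
    """Policy: always use the first healthy pServer in the list."""
    return healthy[0]


# User-adjustable policies, selected by name (names are illustrative).
POLICIES: Dict[str, Callable[[List[Server]], Server]] = {
    "least_loaded": least_loaded,
    "first_available": first_available,
}


def route(servers: List[Server], policy: str = "least_loaded") -> Optional[str]:
    """Send one request to a healthy pServer chosen by the named policy."""
    healthy = [s for s in servers if s["healthy"]]   # result of health monitoring
    if not healthy:
        return None
    chosen = POLICIES[policy](healthy)
    chosen["active_requests"] += 1
    return chosen["name"]
```

Unhealthy pServers are filtered out before any policy runs, so changing the policy never routes traffic to a node that monitoring has flagged.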

Control All Machines

In 2006, PAN Manager took on a whole new set of powers when Egenera introduced its vBlade software. This extension adds hypervisor technology to PAN Manager, so the program can manage both real machines and virtual machines. The IT guy can now control physical and virtual servers from the same pane of glass, as they say. “Compare this to running a non-PAN rack of servers,” Manca says. “You’d have to go into the hardware vendor’s software to set up that product, then get into VMware Virtual Center to configure and manage your virtual machines. In PAN, you’ve got one management console that understands hardware and virtual machines.” A server setup can be assigned to an actual physical blade or to a virtual blade partition, giving IT staff not just flexibility but an easy way to take advantage of it.

Deleting layers of complexity from datacenter management is one of Egenera’s major objectives with PAN Manager. As a result, the idea goes, IT staff spend less time shuffling resources or meeting to discuss shuffling those resources. Businesses can react almost instantly to a demand for more processing power, or less. “PAN Manager gives you pools of servers that can be contracted or expanded as your application needs,” Manca says. “You can do it manually or on the fly. You could have a trigger that says ‘If CPU usage goes above 80 percent …’ or walk up to the console and pull a blade as you need it. If you’re over-provisioned, you can bring it back down. You can very quickly take an application stack and load it on the right-size blade and free up that larger blade for a more demanding app. PAN Manager gives the user the flexibility to manipulate the hardware to their needs.”
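The kind of trigger Manca mentions can be sketched in a few lines of Python. Only the 80 percent figure comes from his example; the function name `resize_pool`, the lower threshold of 30 percent, the one-blade step size and the minimum pool size are assumptions chosen for illustration.

```python
def resize_pool(active_blades: int,
                cpu_percent: float,
                free_blades: int,
                high: float = 80.0,     # expand above this (from the quoted example)
                low: float = 30.0,      # contract below this (assumed value)
                min_blades: int = 1) -> int:
    """Return the new number of blades assigned to an application pool."""
    if cpu_percent > high and free_blades > 0:
        return active_blades + 1        # pull one blade from the free pool
    if cpu_percent < low and active_blades > min_blades:
        return active_blades - 1        # over-provisioned: hand one blade back
    return active_blades                # within bounds, or nothing to add
```

Called periodically with a fresh CPU reading, a rule like this grows the pool under load and shrinks it again when demand drops, which is the "contracted or expanded" behavior described above.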

PAN Manager originally was tied to Egenera’s BladeFrame hardware, but the company has decoupled its star software and is letting it go to other servers. Egenera has announced OEM deals with Fujitsu Siemens and, most recently, with Dell to offer PAN Manager on their hardware. Forrester Research analyst James Staten says the Dell endorsement gives PAN Manager “instant credibility.”

Users Banking on Egenera

Egenera has been around since about the turn of the century, and in that time has delivered BladeFrame and PAN Manager to companies across a broad range of businesses. “Because we run Red Hat, SUSE, Windows [and] Solaris x86, in addition to VMware and XenSource, the apps we support tend to be broad,” Manca says. “We get a lot of traction with medium enterprise companies. They don’t have large IT staffs, so they want something they can plug in and run quickly.”

One of its first customers was Credit Suisse, which uses BladeFrames for online trading activity. Military Health Systems, which operates more than 100 VA hospitals, uses BladeFrames to process all its data. Another financial institution, Standard Chartered PLC, recently selected Egenera’s technology for its core banking application used by 1,200 banks worldwide. In a case study issued by Egenera, Standard Chartered says it chose Egenera because it was “the only vendor with an integrated solution for virtualization and the I/O fabric.” Standard Chartered says it can now bring a new country online in nine days rather than the 45 it takes with a legacy architecture.

SAVVIS, a large managed service provider with 24 datacenters, has seen a similar speed-up in meeting its customers’ requirements, Manca says. For companies like this that provide scalable hosting, he says, “we are the ultimate utility computer. They can run any application stack on any server at any time. You can change the underlying hardware and virtual machine at any time.” As a result, SAVVIS can “spin up a new customer in minutes, whereas it used to take 30 days,” he says. “Now they can start generating that revenue immediately.”
