Cloud Computing, Virtualization 2.0 Among NGDC Highlights

By Derrick Harris

August 11, 2008

Did anyone actually think a conference called Next Generation Data Center (NGDC) would come and go without addressing “the cloud?” In 2008 — a year destined to go down in the IT annals as the “Year of the Cloud” — that’s not even a possibility. However, cloud computing wasn’t the only topic discussed at the show, and even when it was the paradigm du jour, its presentation ranged from “this is what it is” to “this is how it looks” to “this is how we’re using it — today.” (And I didn’t even attend all of the sessions dedicated to cloud computing.)

The whole NGDC/LinuxWorld show (held last week in San Francisco) kicked off with a keynote by Merrill Lynch Chief Technology Architect Jeffrey Birnbaum, who outlined the investment bank’s move to “stateless computing.” Actually, he explained, it’s not so much about being stateless as it is about where the state is. Merrill Lynch is moving from a dedicated server network to a shared server network, functioning essentially as a cloud that allows Merrill Lynch to provision capacity rather than machines.

Aside from architectural change, Birnbaum says another key element of Merrill Lynch’s stateless infrastructure is its enterprise file system, which he believes really should be called an “application deployment system.” In a namespace environment like the Web, all the components needed for an application to run are referenceable through the file system, thus negating the need for heavy-duty software stacks and golden images. The file system works via a combination of push and pull, or of replication and caching, said Birnbaum. The strategy also works for virtual desktops, he said, with all applications — including the operating system — being streamed to the thin client.
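The push/pull combination Birnbaum describes can be pictured as a simple pull-through cache: hot components are replicated to nodes ahead of time, and anything else is fetched on first reference and cached locally. This is a minimal illustrative sketch, not Merrill Lynch's actual system; all paths and names here are hypothetical.

```python
# Sketch of the push/pull idea: "push" replicates a component to a node
# proactively; "pull" fetches it on first reference and caches it.
class NodeCache:
    def __init__(self, origin: dict[str, bytes]):
        self.origin = origin              # authoritative file-system namespace
        self.local: dict[str, bytes] = {} # this node's local replica/cache

    def push(self, path: str) -> None:
        """Replicate a component to this node ahead of demand (push)."""
        self.local[path] = self.origin[path]

    def fetch(self, path: str) -> bytes:
        """Resolve a reference, pulling and caching on a miss (pull)."""
        if path not in self.local:
            self.local[path] = self.origin[path]
        return self.local[path]

# Hypothetical namespace with one application component.
origin = {"/apps/pricing/libcalc.so": b"\x7fELF..."}
node = NodeCache(origin)
node.push("/apps/pricing/libcalc.so")   # pre-staged replica
node.fetch("/apps/pricing/libcalc.so")  # served from the local cache
```

The point of the model is that a node never needs a full golden image installed up front; it only needs the namespace and a resolution mechanism.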

But keeping things lightweight and flexible is only part of the challenge; workload management also is important. Birnbaum says widespread virtualization is a key to this type of infrastructure, but some applications can’t handle the performance overhead imposed by running in a virtual environment. For these types of applications, a stateless computing platform needs the ability to host applications either physically or virtually. Additionally, says Birnbaum, everything has to be policy-based so primary applications get their resources when they need them. On the workload management front, Merrill Lynch is working with Evergrid, Platform Computing and SoftModule.
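The physical-or-virtual, policy-driven placement Birnbaum describes might be sketched like this. Everything here is a hypothetical illustration of the idea, not the actual scheduler; the workload attributes and names are invented.

```python
# Illustrative policy-based placement: workloads that cannot tolerate
# virtualization overhead go to physical hosts, the rest share the VM
# pool, and higher-priority workloads are placed first.
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    priority: int               # higher number = more important
    tolerates_vm_overhead: bool

def place(workload: Workload) -> str:
    """Return the host class a workload should run on."""
    if not workload.tolerates_vm_overhead:
        return "physical"       # e.g. latency-sensitive trading apps
    return "virtual"            # everything else shares the VM pool

def schedule(workloads: list[Workload]) -> list[tuple[str, str]]:
    # Policy: highest-priority workloads claim resources first.
    ordered = sorted(workloads, key=lambda w: w.priority, reverse=True)
    return [(w.name, place(w)) for w in ordered]

jobs = [
    Workload("risk-batch", priority=1, tolerates_vm_overhead=True),
    Workload("market-data", priority=9, tolerates_vm_overhead=False),
]
print(schedule(jobs))  # market-data -> physical, risk-batch -> virtual
```

A real implementation would weigh many more signals (licensing, data locality, SLAs), but the two-branch policy captures the physical/virtual split the talk described.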

For the folks concerned about capital expenditure, the best part about Merrill Lynch’s stateless vision is that it can be done on mostly (if not entirely) commodity hardware. Because the state is in the architecture instead of an individual machine, Birnbaum says you can buy cheaper, less redundant and less specialized hardware, ditching failed machines and putting the work elsewhere without worry.

One of the big business benefits of stateless computing at Merrill Lynch is that it lets the financial services leader maximize utilization of existing resources. If someone needs 2,000 servers for an exotic derivatives grid and the company is only at 31 percent utilization, it has that spare capacity and doesn’t have to buy those additional servers, Birnbaum explained. Offering some insight into the financial mindset, Birnbaum added that Merrill Lynch buys new servers when it reaches 80 percent utilization, thereby ensuring a capacity cushion in case there is a spike.
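The arithmetic behind that rule is straightforward and can be sketched as a capacity check. The 80 percent threshold and the 2,000-server/31 percent figures come from the article; the 10,000-server fleet size below is an assumption added purely to make the example concrete.

```python
# Hypothetical illustration of the capacity rule: serve a request from
# spare capacity only if doing so keeps utilization at or below the
# threshold that triggers a hardware purchase.
BUY_THRESHOLD = 0.80  # utilization level at which new servers are bought

def can_absorb(total_servers: int, utilization: float, requested: int) -> bool:
    """Can a request be served from spare capacity without new hardware?"""
    in_use = int(total_servers * utilization)
    spare = total_servers - in_use
    fits = requested <= spare
    stays_under_cushion = (in_use + requested) / total_servers <= BUY_THRESHOLD
    return fits and stays_under_cushion

# With an assumed 10,000-server fleet at 31% utilization, a 2,000-server
# request fits comfortably: (3,100 + 2,000) / 10,000 = 51% utilization.
print(can_absorb(10_000, 0.31, 2_000))  # True
```

The same check returns False once the fleet is already near the threshold, which is exactly when the buy-more-hardware policy kicks in.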

Speaking less about a real-world internal cloud deployment and more about the building blocks of cloud computing was Appistry’s Sam Charrington. One of his key takeaways was that while virtualization is among cloud computing’s driving technologies, a bunch of VMs does not equal a cloud. It’s great to be able to pull resources or machines from the air, Charrington explained, but the platform needs to know how to do it automatically.

Beyond getting comfortable with underlying technologies and paradigms like virtualization and SOA, Charrington also advised would-be cloud users to get familiar with public clouds like Amazon EC2, GoGrid and Google App Engine; inventory their applications to see what will work well in the cloud; and to get a small team together to plan for and figure out the migration.

Looking forward, Charrington says the cloud landscape will consist not only of the oft-discussed public clouds like EC2, but also will include virtual private clouds for specific types of applications/industries (like a HIPAA cloud for the medical field) and private, inside-the-firewall clouds. Citing The 451 Group’s Rachel Chalmers, Charrington said the best CIOs will be the ones who can best place applications within and leverage this variety of cloud options.

The cloud also was the focus of grid computing veteran Ravi Subramaniam, principal engineer in the Digital Enterprise Group at Intel. Subramaniam led off his presentation by noting that cloud computing is not “computing in the clouds,” mainly because whether it is done externally or internally, cloud computing is inherently organized, and users know the provider — be it Amazon, Google or your own IT department. Illustrating a sort of cloud version of Newton’s third law, Subramaniam pointed out that for every one of cloud computing’s cons, there is an equally compelling pro: security issues exist, but CAPEX and OPEX savings can be drastic; end-users might have limited control of the resources, but those resources are simple to use by design; and so on.

Subramaniam focused a good portion of his talk on the relationship between grid computing and cloud computing, positing that the two aren’t as different as many believe. However, he noted, coming to this conclusion requires viewing grid as a broad, service-oriented solution rather than something narrow and application-specific. In their ideal form, he explained, grids are about managing workloads and infrastructure in the same framework, as well as about matching workloads to resources and vice versa.

For all of its strengths, though, grid computing does have its weaknesses, among which Subramaniam cited that it is not straightforward to apply and is of limited use in small-scale environments. Cloud computing attempts to simplify grid from the user level, he said, which means utilizing a uniform application model, using the Web for access, using virtualization to mask complexity and using a “declarative” paradigm to simplify interaction. Essentially, Subramaniam concluded, the cloud is where grid wanted to go.

If users approach both cloud computing and grid computing with an open mind and apply broad definitions, they will see that the synergies between the two paradigms are quite strong. The combination of grid and cloud technologies, Subramaniam says, means virtualization, aggregation and partitioning as needed, a pool of resources that can flex and adapt as needed, and even the ability to leverage external clouds to augment existing resources.

Virtualization 2.0

Of course, cloud computing wasn’t the only topic being discussed at NGDC, and one of particular interest to me was the concept of “virtualization 2.0.” In a discussion moderated by analyst Dan Kuznetsky, the panelists — Greg O’Connor of Trident Systems, Larry Stein of Scalent Systems, Jonah Paransky of StackSafe and Albert Lee of Xkoto — all seemed to agree that Virtualization 2.0 is about moving production jobs into virtual environments, moving beyond the hypervisor and delivering real business solutions to real business problems.

But the real discussion revolved around what is driving advances in virtualization. Xkoto is a provider of database virtualization, and Lee said he has noticed that the first round of virtualization raised expectations around provisioning, failover and consolidation, and now users want more. In the usually grounded database space, he noted, even DBAs are demanding results like their comrades in other tiers have seen.

Another area where expectations have increased is availability, said StackSafe’s Paransky. While it used to be only transaction-processing systems at big banks that demanded continuous availability, Paransky quipped (although not without an element of truth) that it’s now considered a disaster if e-mail goes down for five minutes — and God forbid Twitter should go down. People just expect their systems and applications will always be available, and they’re expecting virtualization to help them get there.

Lee added that once you jump in, you have to swim, and users want to continue to invest in virtualization technologies.

However, there are inhibitors. Lee contends that adopters of server virtualization solely for the sake of consolidation risk backing themselves into a corner by relying on fewer boxes to run the same number of applications. If one box goes down, he noted, the effect is that much greater.

Fear of change also seems to be inhibiting further virtualization adoption. Scalent’s Stein said companies see the value of virtualization, but getting them to overcome legacy policies around new technology can be difficult. What’s more, he added, is that it’s not as easy as just ripping and replacing — virtualization needs to work with existing datacenters. Paransky echoed this concern, noting that virtualization can mean uncontrolled change, which is especially scary to organizations with solid change management systems.

Also, he noted, Virtualization 1.0 isn’t exactly past-tense, as 70-80 percent of IT dollars are spent on what already is there. Paransky assured the room that although they’re not sexy, people still have mainframes because of this compulsion to improve or maintain existing systems rather than move to new ones.

Moderator Kuznetsky was not oblivious to these obstacles, asking the panel what will drive organizations to actually make the leap to Virtualization 2.0, especially considering the general rule that organizations hate to change anything or adopt new technologies. Xkoto’s Lee commented that the IT world responds to pain, resisting change for the sake of change and holding out until there are real pain points.

Paransky took a more forceful stance, stating that organizations no longer have the luxury to resist change like they did in the past. Customers pay the bills, he says, and they don’t like the turtle-like pace of change — they want dynamism. He noted, however, that organizations don’t hate change because they think it is bad, but rather because it brings risk. The trick is balancing the benefits that virtualization can bring with the need to keep things up and running.
