Mission-Critical Cloud Computing? Check Back in Five Years

By Dennis Barker, GRIDtoday

June 16, 2008

Let’s fast-forward through the question of when enterprise IT will wholeheartedly embrace cloud computing — the answer, according to analysts, is five years — and get right to the better question: Will they ever trust the cloud with their life-or-death applications? Those big, demanding applications that require what some call extreme transaction processing — trading, reservations, electronic payments, etc. — can you run those in the cloud? Do you want to?

GigaSpaces Technologies specializes in helping companies develop distributed, scalable, on-demand systems that can handle big, honking, rapid-fire enterprise and Web applications. And it intends to help companies run those systems on Amazon’s Elastic Compute Cloud.

But before we go to the cloud, a bit of background about GigaSpaces’ approach to delivering scalable applications. GigaSpaces’ flagship product is an application server built from scratch for intense computing. The name, eXtreme Application Platform (XAP), kind of gives that away. The company describes it as middleware for running high-performance, high-reliability applications on grids and other distributed systems. The biggest challenge XAP is designed to tackle is scaling on demand, right when a spike hits.

“It’s not the constant growth in the amount of data, transactions and service requests. It’s the unpredictable peaks and troughs,” says Geva Perry, chief marketing officer at GigaSpaces. “Like when AT&T had to provision all those iPhones suddenly for more people than expected and had systems crashing.” That might also be a business modeling problem, but the point is those peaks can be erratic, and the one-time events will get you every time. (Nobody expects the Spanish Inquisition.)

“You can throw money at the problem and just buy lots of servers to have on hand,” Perry says. “But a lot of companies do that and realize they’ve overprovisioned and have all that stuff sitting idle.” Underprovisioning can have worse consequences.

“What we have is an application platform that allows you to scale cost-effectively and quickly on demand with no changes to your application,” Perry says. “XAP lets you build a high-throughput application so that as demand grows, it can respond, and you don’t have to use any new APIs or make any architecture changes. Developers can write in their usual Java or .Net or whatever.”

“XAP enables your applications to scale linearly, and that’s the only way to scale effectively,” Perry says. “Add 100 servers, then handle 100 times more transactions. But the reality is most middleware products don’t handle things the way we do and end up with bottlenecks in one system or another. Doubling servers doesn’t double throughput.” In more painful accounting terms, we’re talking about diminishing return on hardware investment.

GigaSpaces says it does a few things differently to avoid the dreaded latency. All services reside in the same server, eliminating the usual hops between the messaging system, the database, and so on. “With XAP, a transaction’s data, business logic, and messaging all complete in the same place,” Perry says. In the GigaSpaces universe, applications travel as self-sufficient “processing units.” When it’s time to scale up to meet demand, you add more processing units. “It’s simple scaling. One click,” Perry says. XAP also executes every transaction in local memory, avoiding a trip to the database server (transactions are archived there later).
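To make the colocation idea concrete, here is a minimal, hypothetical Java sketch of a “processing unit”: the in-memory data, the business logic, and the archive queue all live in one process, and the database write happens later, off the critical path. Every name below is invented for illustration; this is not GigaSpaces’ actual API, only the shape of the pattern Perry describes.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Conceptual sketch: messaging, business logic, and data share one process,
// so a transaction never hops to a remote broker or database to complete.
public class ProcessingUnit {
    // In-memory data store: the transaction "commits" here at memory speed.
    private final Map<String, Double> accountBalances = new ConcurrentHashMap<>();
    // Write-behind queue: committed work is archived to the database
    // asynchronously, off the critical path.
    private final BlockingQueue<String> archiveQueue = new LinkedBlockingQueue<>();

    // Business logic runs in-process against local memory.
    public void processPayment(String accountId, double amount) {
        accountBalances.merge(accountId, amount, Double::sum); // in-memory commit
        archiveQueue.offer(accountId + ":" + amount);          // archive later
    }

    // Background archiver drains the queue to the database of record.
    public void startArchiver() {
        Thread archiver = new Thread(() -> {
            try {
                while (true) {
                    String record = archiveQueue.take();
                    System.out.println("archiving to DB: " + record); // stand-in for a JDBC write
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        archiver.setDaemon(true);
        archiver.start();
    }

    public static void main(String[] args) {
        ProcessingUnit unit = new ProcessingUnit();
        unit.startArchiver();
        unit.processPayment("acct-42", 100.0); // completes without leaving the JVM
    }
}
```

Scaling up, in this model, means deploying more copies of the unit rather than adding tiers, which is why adding servers can translate more directly into added throughput.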

GigaSpaces says XAP’s scalability and performance features meet the needs of large-scale applications, including SaaS, financial services, e-commerce, online reservations, telecom provisioning and gaming, and that it has customers in all those areas. Financial services company Susquehanna International Group, for example, built its distributed trading platform, which relies on multiple low-latency applications, on top of XAP.

Persistent Clouds

With success in on-premises grids and clusters, GigaSpaces is making a more overt push to bring extreme applications to the cloud by offering XAP for use with Amazon’s EC2. The company plans an official announcement for June 25. “Our cloud offering has been in stealth mode, sort of, but people have been coming to us to discuss it,” Perry says. “There are about 14 companies in the pipeline to use XAP with Amazon Web Services.”

Deploying extreme-style apps on EC2 is not trivial. But GigaSpaces says using XAP simplifies building an application for that environment. “It solves the problem of how do you build a powerful app for the cloud,” Perry says. “Our Amazon offering is truly an application server that can grow and shrink in the cloud on demand.” And because the system supports transparent scaling between in-house servers and EC2, applications can run locally and then switch to the cloud for peak loads.

GigaSpaces provides an Amazon Machine Image configured with installation and scripts to run an entire transaction or computation within a single AMI. “You write your application once, deploy it to the number of nodes you need, and scale up by launching additional AMIs, or scale down by killing those instances,” Perry says. GigaSpaces supports all Amazon machine sizes, and “we charge for the software piece by the hour: 20 cents for a small machine, 80 cents for large, and $1.60 for extra large.”
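As a rough illustration of that launch-and-kill scaling model, here is a hypothetical Java sketch. `CloudClient`, `launchInstance`, and `terminateInstance` are invented placeholders standing in for Amazon’s EC2 interface, and `ami-placeholder` is not a real image ID; nothing here is Amazon’s or GigaSpaces’ actual API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Invented stand-in for a cloud provisioning API.
interface CloudClient {
    String launchInstance(String amiId);      // returns a new instance id
    void terminateInstance(String instanceId);
}

public class ElasticScaler {
    private final CloudClient cloud;
    private final String amiId;
    private final Deque<String> running = new ArrayDeque<>();

    public ElasticScaler(CloudClient cloud, String amiId) {
        this.cloud = cloud;
        this.amiId = amiId;
    }

    // Scale to the desired node count: launch additional AMIs on the way up,
    // kill surplus instances on the way down.
    public void scaleTo(int desired) {
        while (running.size() < desired) {
            running.push(cloud.launchInstance(amiId));
        }
        while (running.size() > desired) {
            cloud.terminateInstance(running.pop());
        }
    }

    public static void main(String[] args) {
        // Stub client so the sketch runs standalone.
        CloudClient stub = new CloudClient() {
            private int next = 0;
            public String launchInstance(String amiId) {
                String id = "i-" + (next++);
                System.out.println("launched " + id + " from " + amiId);
                return id;
            }
            public void terminateInstance(String id) {
                System.out.println("terminated " + id);
            }
        };
        ElasticScaler scaler = new ElasticScaler(stub, "ami-placeholder");
        scaler.scaleTo(3); // peak load arrives
        scaler.scaleTo(1); // demand subsides
    }
}
```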

Anyone who’s been around IT long enough knows that everything fails eventually, yet news of Amazon service disruptions, like EC2’s last October, travels fast and lingers. So reliability is one of the first things GigaSpaces talks about when describing what it brings to the cloud. Besides the scalability and throughput that come with implementing XAP, company officials say its technology adds a layer of failsafe insurance to EC2.

“People naturally wonder, ‘What happens to my transaction if EC2 or S3 fails?’” says Dekel Tankel, director of technical alliances at GigaSpaces. “We have built in a reliable data grid so you don’t have to worry if Amazon fails or if specific nodes fail.” Each node, or AMI, has a “hot” synchronous backup running on a separate AMI, and if one fails, “the application instantly fails over, at in-memory speed, to the backup AMI,” Tankel says. “Once the failed AMI is resumed, GigaSpaces automatically ‘heals’ the cluster and provides another backup.”
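A conceptual sketch of that hot-backup scheme, in plain Java with invented names (not GigaSpaces’ actual API): every write commits synchronously on both the primary and its backup, so when the primary dies, the backup already holds the full state and can answer immediately.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Conceptual primary/hot-backup pair. Names are illustrative only.
public class ReplicatedNode {
    private final Map<String, String> state = new ConcurrentHashMap<>();
    private ReplicatedNode backup;      // the "hot" standby on a separate AMI
    private volatile boolean alive = true;

    public void setBackup(ReplicatedNode backup) { this.backup = backup; }
    public void fail() { alive = false; }

    // Synchronous replication: the write lands on the backup before returning,
    // so no acknowledged transaction is lost with the primary.
    public void write(String key, String value) {
        state.put(key, value);
        if (backup != null) backup.state.put(key, value);
    }

    // If the primary is down, serve from the backup at in-memory speed.
    public String read(String key) {
        ReplicatedNode target = alive ? this : backup;
        return target.state.get(key);
    }

    public static void main(String[] args) {
        ReplicatedNode primary = new ReplicatedNode();
        ReplicatedNode standby = new ReplicatedNode();
        primary.setBackup(standby);
        primary.write("trade-1", "filled");
        primary.fail();                               // the primary AMI dies
        System.out.println(primary.read("trade-1"));  // backup answers: "filled"
    }
}
```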

Interest in XAP for AWS has come from large and small companies who need a scalable application server that won’t require a big investment of time and money, according to Tankel. “Even if they are not ready to run mission-critical applications on EC2, they need a lot of resources to test their new applications. It’s not easy to get 100 servers in an organization for a testing cycle. With our software and EC2, they can write the application using XAP, then take a hundred servers from Amazon for a few hours, then let them go when done.”
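To put rough numbers on that scenario (assuming Amazon’s then-published rate of about 10 cents per hour for a small EC2 instance, on top of the 20-cents-per-hour GigaSpaces charge quoted earlier): a testing cycle of 100 small instances running for four hours would cost roughly 100 × 4 × ($0.10 + $0.20) = $120, a rounding error next to procuring 100 physical servers.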

This “first step into the cloud,” Tankel says, will prove to large IT organizations that compute-intense applications can be scaled on demand reliably and cost-effectively in this new environment. “They will still have their concerns, especially about security, but these concerns tend to come and go or be technically resolved. There’s ultimately no reason to think your own datacenter is more reliable” than a properly formed cloud.

“We really have a chance now as an industry to make a tremendous shift in how applications are built,” Perry says. “Too many are still being built for a specific platform or require certain components. With cloud computing and middleware that provides complete abstraction, we can make applications that are truly portable. You should be able to move applications from your own datacenter to the Amazon cloud to another cloud, apps that are movable without recoding. That is one of our goals with XAP.”

“In many cases, the economics of the cloud are compelling enough that it’s inevitable the industry will move to it,” Perry says. “Companies are spending millions on their own datacenters … and they end up having to be in the IT business when they don’t want to be. It makes sense to go to someone with the expertise and the capacity.”

Extreme transaction processing could eventually be commonplace in the cloud, once certain problems are solved, says analyst Massimo Pezzini, a Milan-based vice president at the Gartner research firm who specializes in application development and integration. “GigaSpaces is one of the companies working on those problems. From what I can see, this is one of the first manifestations of an opportunity for customers to take advantage of the cloud to run transactional workloads. Today, most cloud applications are not very demanding in terms of performance, but GigaSpaces could allow people to deploy large and demanding applications in the cloud.” The benefits, Pezzini says, include lower processing costs, huge savings from avoiding hardware investments, convenience, more innovation in software as a service, and the opportunity to grow new transaction-intensive businesses.

Bringing the XAP technology to the cloud is definitely a step in the right direction, he says. “But will Bank of America be moving its banking systems into the cloud? Probably not anytime soon. Even when the technical issues and issues of backup and security are resolved, there is still the question of trust. Do you trust storing your critical data in the cloud? You really need to trust your cloud provider.”

For those businesses that depend on transactions at extreme speeds, forging that kind of trust relationship, says Pezzini, “takes about five years.”
