Oracle’s XTP Sage Shares his Wisdom

By Nicole Hemsoth

April 7, 2008

In this interview, Cameron Purdy, Oracle’s vice president of Fusion Middleware development, discusses how customer demands for extreme transaction processing are evolving, as well as how use cases for Coherence (the data grid solution Purdy developed as founder of Tangosol) have evolved since becoming part of the Oracle software family.

— 

GRIDtoday: What industries are using Coherence, and what industries are taking advantage of XTP, in general?

CAMERON PURDY: A few key industries are probably our No. 1 driving factor both in terms of technology and, in a lot of cases, the revenue associated with the product. By a statistically significant margin, the No. 1 is still financial services. Financial services, from an XTP point of view, drove our initial move into the grid environment and continues to drive a substantial portion, and the extreme portion, of the XTP vision in terms of the features on which we focus. Those features turn out to be universally appropriate in every industry we’re working in, but because of the competitive nature of [the financial services] industry, coupled with the grid brains that have been vacuumed into that industry for their ability to solve some of these problems, financial services has remained the biggest adopter of XTP to date. But it’s certainly not the only one.

Online systems are another huge driver. So online retailers, online travel systems and online gaming companies — the other side of the financial market, the ones who bet on horses instead of stocks — traditionally have been and continue to be a pretty big chunk of market that we serve.

We continue to work with a growing list of logistics companies, and we’ve been very successful in that market. One of our customers in that market mentioned to us that they are an IT organization that happens to ship packages. Commensurate with being in that market, there is a huge volume of information — rapidly changing information — and lots of opportunities for small optimizations to have a big impact on the bottom line. That certainly is one of the use cases for which our product has been very popular — the ability to draw significant conclusions from massive heaps of information. We’ve been very successfully adopted by telcos and utilities, as well, for the same reasons: large amounts of information, requirements for uptime, systemic availability.

I’d say over the last year, and probably as a side effect of being part of the Oracle organization, a lot of the adoption we’re seeing has been associated with large companies and organizations and their uptake of service-oriented architectures. As that first wave of SOA has taken hold of the industry, it has certainly propelled Coherence in terms of the requirement for creating systemic availability, for creating continuous availability of systems, and being able to scale out critical services.

When you think of infrastructure, you don’t necessarily think of SOA as being a driving factor behind extreme transaction processing, but if SOA actually takes off within an organization, you’re going to have a relatively small number of services that end up far exceeding their original dedicated footprints. When something is popular in an operational world, the same Slashdot effect that we saw on the Web 10 years ago applies to services in a service-oriented architecture. As you start to expose valuable information within an organization, as soon as it becomes available, everyone with an Excel spreadsheet, anyone with Javascript capabilities, let alone your IT organization using .NET, Java or anything back to COBOL, now has the ability to grab that information from you and submit transactions to you. The end result is that SOA generates a requirement for additional availability, but it also generates incredible hotspots in terms of infrastructure — the requirement to be able to scale out systems to the levels that we describe when we talk about XTP.

Gt: How are you seeing requirements evolve within these industries?

PURDY: I think what we saw happening a year or two ago has definitely broadened, in terms of adoption, as well as deepened, in terms of requirements. A lot of these systems are mission-critical systems. They’re not science fair projects or academic projects; these are systems driving core infrastructure, particularly in financial institutions, in terms of being able to shrink overnight windows on risk calculations and things like that. We see a shift from what we used to think of as “exotic” use for these types of systems to — at least in those types of environments — mainstream use. I think a lot of it certainly has been driven by better understanding of the technology. Obviously, Gartner and Forrester and other analyst groups have been pretty instrumental in popularizing some of the notions that they witnessed in customer accounts of ours. Thus, these are things that have gone from being exotic to being pretty much mainstream — at least within markets like telcos, financial institutions and high-scale e-commerce systems.

Gt: Are there any industries in particular where you’ve seen a dramatic change in either demand for or use of these types of systems?

PURDY: Certainly, online companies are going to be a poster child for it. In particular, that has to do with the fact that their growth is capable of exceeding any expectation. In other words, even if they don’t require the level of scalability that we can provide today, they can’t be sure they won’t need it. So, quite often, we’re seeing architects adopt Coherence very early on, making sure they have XTP built into their core as a means of insulating them from cost surprises down the line.

When you can achieve, for vast portions of your system, linear scalability on commodity hardware — we’re not talking about million-dollar machines, we’re talking about $2,000 to $5,000 machines and being able to scale out into the thousands of these — you end up having cost predictability. You’re not surprised when your system gets so loaded down that you have to buy more hardware, because you know how much hardware it’s actually going to take to increase your throughput. Basically, it gives you assurance that you’re going to be able to meet your SLAs or meet the perceived requirements of your users at any level of scale. It gives you a type of insurance that is priceless for a CIO or an architect of a system, because it gives them the ability, early on in the project, to address issues that would otherwise be crippling or, from the point of view of a start-up, life-ending.

We’re seeing grid technology adopted much earlier in a project’s life now than in the early days, when many of these systems were only coming to us after they had hit the wall. We certainly gained a reputation for helping companies that had hit the wall, but it’s much more gratifying to see companies adopting it as a core part of what they’re doing.
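To make the economics concrete: under the linear-scaling assumption Purdy describes, capacity planning is little more than division. The sketch below, in Java, shows the kind of back-of-the-envelope estimate an architect might run; every figure in it is an assumption made for the sake of the example, not an Oracle number.

    public class CapacityEstimate {
        public static void main(String[] args) {
            // All figures below are assumptions for the sake of the example.
            double perNodeTps  = 5000;     // measured transactions/sec on one commodity box
            double targetTps   = 750000;   // required peak throughput
            double efficiency  = 0.90;     // assume 90% scaling efficiency, not a perfect 100%
            double costPerNode = 3500;     // midpoint of the $2,000-$5,000 range quoted above

            int nodes = (int) Math.ceil(targetTps / (perNodeTps * efficiency));
            System.out.println("Nodes needed: " + nodes
                    + " (roughly $" + (long) (nodes * costPerNode) + " in hardware)");
        }
    }

With the assumed 5,000 transactions per second per node and 90 percent scaling efficiency, the model calls for 167 of the commodity boxes Purdy mentions, which is exactly the kind of cost predictability he is describing.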

Gt: If Coherence handles the data aspect of an XTP environment, what are your customers doing to address the compute aspect?

PURDY: I think companies already have the compute requirement. We’re talking about companies using DataSynapse or Platform, for example, or, more recently, using some of the open-source offerings such as GridGain. These are companies that, by and large, have traditional compute-intensive loads that they already deployed on those environments. So we’re not so much seeing these companies move to compute after they have a data grid; it’s more that the compute loads they have are data-intensive, and for a number of those types of applications, scaling out a compute grid without having a data grid stitched into it just can’t happen. They’ll get to a point where it doesn’t matter how many compute nodes they add, they’re not going to get anything done because the information is bottlenecked.

This is a lot of what drove us into the grid space to begin with. In some of the earliest grid projects at some of the banks here in the States, they were already doing the compute side, but they needed the data to keep up with the compute. The data grid is a very natural fit with high-scale compute infrastructures, and we continue to invest heavily in that area. Our customers see that infrastructure not as a compute infrastructure or as a data infrastructure, but as a utility. It’s an investment that they’ve made that allows them to deploy large-scale applications — applications that at certain times of day might consume hundreds of thousands of CPUs in parallel, and at other times might not need anything — out into a utility environment.
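The bottleneck Purdy describes is typically avoided by shipping the work to the node that owns the data rather than pulling the data across the network to the compute tier. The sketch below illustrates the idea against the classic Tangosol-era InvocableMap API; the class and method names are recalled from Coherence 3.x and should be treated as assumptions, and the cache name and the revaluation step are purely hypothetical.

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;
    import com.tangosol.util.InvocableMap;
    import com.tangosol.util.filter.AlwaysFilter;
    import com.tangosol.util.processor.AbstractProcessor;

    public class RevaluePositions {

        // Runs on the grid member that owns each entry, so the position data
        // never has to cross the network to a separate compute node. (In a real
        // deployment this class would also need to be on the grid classpath and
        // configured for serialization.)
        public static class Revalue extends AbstractProcessor {
            public Object process(InvocableMap.Entry entry) {
                double price = ((Double) entry.getValue()).doubleValue();
                entry.setValue(Double.valueOf(price * 1.01)); // hypothetical mark-to-market step
                return null;
            }
        }

        public static void main(String[] args) {
            NamedCache positions = CacheFactory.getCache("positions"); // hypothetical cache name
            // Rather than pulling every entry to the client and pushing results
            // back, ship the processor out and let each partition work in parallel.
            positions.invokeAll(AlwaysFilter.INSTANCE, new Revalue());
        }
    }

The design point is the one Purdy makes: adding compute nodes only helps if the data can keep up, and colocating the calculation with the data removes the network round trips that would otherwise become the ceiling.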

Gt: Overall, how has customer use of Coherence evolved over the past six months to a year?

PURDY: One of the things about going from being fairly exotic to being sort of mainstream is that a lot of the considerations our customers had when we were working with them a few years ago are not the same considerations they have today. Our customers have certainly focused much more on manageability and monitoring, what we refer to internally as “Project iPod” — the idea that just because you’re controlling 10,000 CPUs, it doesn’t mean it has to be as complex as configuring 10,000 servers. For many of these applications, it should be as simple as pressing the “play” button. If it needs 10,000 CPUs to do that, it should be able to allocate, deploy, start up, configure, etc., everything it needs across extreme-scale environments. And just as easily, when it’s done with what it has to do, if it’s no longer needed, it should be able to fold itself back up and put itself away.

Our customers are no longer 100 percent rocket scientists; they’re not all eating and drinking the technology at that level anymore, so our software continues to evolve to be more and more IT-friendly. It focuses on best practices and documentation, and on what, years ago in mainframe parlance, we referred to as “serviceability.” That is, the ability to have your software not only be configured and rolled out, but actually be morphable as it runs, to be able to be serviced, upgraded and hot deployed. All of these are critical to our customers.

In addition, because we’re part of Oracle, the integration with the database has been more and more a desired outcome for many of our customers, so that obviously is an area in which we’ve significantly increased our investment. Also, as I mentioned earlier, I think the investment Oracle has made in service-oriented architecture has influenced a lot of the use cases we’re seeing. From the big shift point of view, what we’re seeing is: (1) a shift of more business users to the grid; (2) more integration required across the database and into the data grid; and (3) the integration with service-oriented technology.
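The “Project iPod” notion Purdy describes is easiest to picture as a single lifecycle abstraction: one call that allocates, deploys, configures and starts everything a job needs, and one call that folds it back up. The toy sketch below is purely hypothetical and does not correspond to any Oracle or Coherence API.

    public class PlayButtonSketch {
        private int allocatedCpus;

        // Allocate, deploy, configure and start in a single step.
        public void play(int cpusNeeded) {
            allocatedCpus = cpusNeeded; // stand-in for a request to a resource manager
            System.out.println("Provisioned " + allocatedCpus + " CPUs and started the job");
        }

        // When the work is done, fold the deployment back up and release the capacity.
        public void stop() {
            System.out.println("Released " + allocatedCpus + " CPUs back to the pool");
            allocatedCpus = 0;
        }

        public static void main(String[] args) {
            PlayButtonSketch job = new PlayButtonSketch();
            job.play(10000); // "as simple as pressing play", even at 10,000 CPUs
            job.stop();
        }
    }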

Gt: How has being part of Oracle, and Oracle Fusion middleware specifically, affected what your customers expect?

PURDY: Fortunately, we’ve been able to keep the entire customer base through the transaction and through the subsequent time period, so we’ve been able to keep those communications open with our customers. We’ve obviously significantly expanded the customer base by being part of a very large organization, and we continue to exceed all expectations on goals that were set forth there.

The end result, though, is that the requests we get from customers continue to increase as a result of this broadening of the customers we’re working with. The benefits to our organization are a much clearer picture of what the market is driving toward and, ultimately, what the needs are of the companies that are adopting this technology. As much as our customers look to us as visionaries, I think the flip side of that is also true: the technology we create is in direct correlation to the trust our customers invest in us, in terms of what they share with us about the problems they’re attempting to solve. It’s a trust relationship, but it works in both directions.

Also, being part of Oracle, we’ve been able to scale our organization in terms of sales, development, marketing, and the level and quality of service that we are able to provide to our customers. From all those vantage points, it’s been an overwhelming success.

Gt: You noted that you’re being asked more often to integrate Coherence into existing Oracle database environments. How do you handle this?

PURDY: Traditionally, we’ve had a number of ways to integrate with the database, including asynchronously through “write-behind” technology, which I think is a pretty big game-changer from an XTP point of view. The ability to do reliable write-behind, which I think we introduced six years ago, is one of the technologies that propelled us well beyond the noise in the market. Also, in terms of read and write coalescing and things like that, we’ve been able to dramatically increase the effectiveness of Oracle databases in large-scale compute grid environments and extreme-scale e-commerce systems.

Additionally, Oracle has invested in and published materials on how the database can push information, such as event feeds, out to its clients in real time. We’re working hand-in-hand with that organization to be able to take advantage of that. We’re not talking about any top-secret, backdoor, hidden stuff. These are all published interfaces and, as much as possible, standards-based approaches to integration with the Oracle database. From our point of view, it’s a good investment for us, and it’s very cost-effective for our customers, as well.
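For readers unfamiliar with the pattern, the essence of the write-behind and coalescing behavior Purdy mentions can be sketched in a few lines of plain Java: writes complete against the in-memory store immediately, repeated updates to the same key collapse into one pending write, and a background thread flushes the latest values to the database after a delay. This is an illustration of the idea only, not Coherence’s implementation; every name in it is hypothetical.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class WriteBehindSketch {
        private final ConcurrentHashMap<String, String> cache =
                new ConcurrentHashMap<String, String>();
        private final ConcurrentHashMap<String, String> dirty =
                new ConcurrentHashMap<String, String>();
        private final ScheduledExecutorService flusher =
                Executors.newSingleThreadScheduledExecutor();

        public WriteBehindSketch(long delaySeconds) {
            // Flush on a fixed delay; updates made in the meantime are coalesced
            // because only the latest value per key survives in the dirty map.
            flusher.scheduleWithFixedDelay(new Runnable() {
                public void run() { flush(); }
            }, delaySeconds, delaySeconds, TimeUnit.SECONDS);
        }

        public void put(String key, String value) {
            cache.put(key, value); // the caller returns immediately; no database round trip
            dirty.put(key, value); // mark the entry for a later, coalesced write
        }

        public String get(String key) {
            return cache.get(key); // reads are served from memory
        }

        private void flush() {
            for (Map.Entry<String, String> e : dirty.entrySet()) {
                writeToDatabase(e.getKey(), e.getValue());
                dirty.remove(e.getKey(), e.getValue()); // stays dirty if it changed again mid-flush
            }
        }

        private void writeToDatabase(String key, String value) {
            // Stand-in for the real JDBC or stored-procedure call.
            System.out.println("DB write: " + key + " = " + value);
        }

        public void shutdown() {
            flush();
            flusher.shutdown();
        }
    }

In the product itself this behavior is configured on a cache rather than hand-coded, with far richer batching, retry and failover semantics than this toy shows; the sketch only conveys why write-behind with coalescing can cut database load so dramatically.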

Gt: If you could narrow it down, what is the ideal use case for Coherence, or for any data grid? What’s the perfect job?

PURDY: There are so many examples of a perfect fit that it’s hard to boil it down to one example. The answer I’ve given to customers when they ask that type of question is “How many applications do you have that you want to have available all the time? How many applications do you want to be able to scale up? How many applications do you have where performance matters? In how many of these applications would you like to have real-time information, information consistency and information reliability?”

What Coherence provides out of the box is not a solution to one of these problems, it’s not a “here’s how you make your app faster” or “here’s how you make your app scale.” Anyone can make an application faster, anyone can build a system that’s 99.999 percent available. We have known solutions for any of these problems in isolation. What’s difficult isn’t solving for one of them, it’s solving for all of those variables at the same time.

What Coherence provides to our customers is a solution to availability, to reliability of information, to linear scalability and to predictable high performance, and it solves those simultaneously. It provides trusted infrastructure for building that next generation of grid-enabled out-of-the-box applications. This has been our differentiator in the market, this trusted status of truly being able to provide information reliability within large-scale, mission-critical environments, and it certainly is one of the reasons this acquisition has worked so well. We have fundamentally invested in those core tenets of these systems, and our customers understand and respect that. 

 
Gt: How do you view the future of the data grid market, specifically around what business needs are going to be driving the next round of technological advancements?

PURDY: One of the greatest things about our industry is the insatiable desire for “more, better, faster.” When you look at the increase in data volumes — whether you’re looking at the doubling of financial feed information every six to nine months or you’re looking at the numbers provided by storage vendors in terms of how much information their customers are managing on an ongoing basis — there seems to be no end to demand for the ability to manage, analyze, produce and calculate. It’s fair to say that our customers astound us with their appetites for being able to [manage] just huge systems.

What we see overall is a move toward consolidating many of the capabilities that we associate today with grid, virtualization and service-oriented architectures, as well as the manageability of all of those, and turning those point solutions toward the problems of utility computing, capacity on demand, SLA management and dynamic infrastructure. I think our goal as an industry has been to move from having grid, for example, be considered an exotic concept to having it considered almost a de facto standard for how all applications should be built and deployed. We’re not talking about exotic concepts; we’re talking about things everybody wants. When was the last time you talked to a customer that didn’t care about scalability or didn’t care about systemic availability? These are things that all of our customers are faced with as challenges, and being able to provide a systemic solution to those challenges is, from a customer point of view, just a natural evolution of what our industry has been providing.
