What I Learned at NGDC: Technology is Ready, Users are Not

By Derrick Harris

August 13, 2007

If I had to boil what I observed at last week’s Next Generation Data Center conference into one thought, it would be this: Although technologies to virtualize, optimize and automate datacenters do exist and are mature, organizations are still leery about making the transformation, despite coveting the associated improvements in performance, flexibility and overall efficiency.

As evidence that these technologies are mature, one really need look no further than the event’s opening keynotes, in which Amazon CTO Werner Vogels and eBay distinguished research scientist Paul Strong discussed the Web-scale datacenters being operated by their respective employers. Strong, for his part, delved a little deeper into the nitty-gritty details (see here and here for more on this), while stressing throughout the importance of building a datacenter that: (1) runs processes driven by SLAs; (2) operates as a value center rather than a cost center; and (3) enables the rolling out of new utilities, platforms, etc. To achieve this next-generation datacenter, he said, many technologies can and should be considered, including (but certainly not limited to) grid computing, utility computing, real-time solutions and virtualization.

Now, it’s unlikely that most organizations have the resources (or the need; as Strong said, “If we don’t keep up, our business is gone.”) to develop and/or manage a datacenter like eBay’s, a highly automated and virtualized environment consisting of several thousand blade servers, but that doesn’t mean companies with less demand on their infrastructures can’t learn some lessons from the online auction leader. For starters, said Strong, automation is the key to efficiency and, in a point he made sure to drive home, it is important to “manage relationships, not things.” This advice should be particularly relevant as today’s average datacenters continue to evolve toward advanced models like that of eBay. After all, when you’re staring down thousands of physical machines (and likely significantly more virtual ones) you can’t possibly expect to manage each one individually.

As for Vogels, whose presentation kicked off the doubleheader, he targeted his comments toward those companies who aren’t too keen on managing their own resources, and he used the opportunity to push Amazon’s stable of Web services. For most companies, he estimates, 70 percent of time and money expenditures go toward “heavy lifting” operations like maintenance, load balancing or software management, among others, none of which offer much in terms of innovation or helping differentiate your company from competitors. Unless you’re in an industry where having a customized, highly efficient datacenter directly translates into dollars, he suggested, it might be a waste of resources on all levels. “At [Amazon’s] scale, datacenters matter,” he stated. “They don’t for everyone.”

Following his statement that “I hate datacenters,” Vogels cited the recent power outage at a popular San Francisco datacenter, which led to temporary shutdowns of several Web 2.0 leaders, including Craigslist, Second Life and Netflix, as one example of what can go wrong. Power-wise, he added, even if you have generators to ensure you keep running, you still need batteries to cover the gap between the power going out and the generators kicking in. Staying with physical issues, Vogels noted that datacenter managers also need to worry about sufficient cooling and how to handle a fire or other disasters. And that’s not even addressing business-side concerns, such as whether one datacenter is enough, or how you’re going to push out enough bandwidth to handle demand. Oftentimes, he said, companies need to overprovision to handle peak loads or in case they become successful.

His solution? Utilize services such as Amazon’s Elastic Compute Cloud (EC2), as well as its other Web services offerings, to handle your computing needs, paying only for what you actually use. Vogels talked about this notion as the “push versus pull” model of resource management, where “pushing” refers to the old-style method of preemptively pushing resources toward problems and “pulling” refers to the more progressive concept of pulling in resources from a centralized pool as needed, just like Amazon itself does. For organizations that don’t necessarily benefit from managing their own datacenters, he said, this utility model will allow them to get the computing resources they need elsewhere, thus freeing up time and money to spend on innovation.

However, while Vogels’ solution to datacenter woes might sound logical and easy enough to incorporate, I wouldn’t hold my breath waiting for this idea to gain mainstream acceptance, much less adoption. Why, you ask? Because organizations are having a difficult enough time taking advantage of next-generation tools within their own walls — something far less scary than the notion of relying on someone else to make sure their applications get the attention they deserve. This was made abundantly clear during a presentation by OGF President Mark Linesch, who was simply trying to lay out the business case for and current status of grid computing technologies.

During Linesch’s presentation, an audience member (who actually has some firsthand grid experience under his belt) asked how he is supposed to sell the idea of grid or virtualization to his applications developers, some of whom still oppose running their applications on distributed or virtualized platforms. (In fact, this gentleman just denied a request for 60-plus servers to develop and test a new application, instead preferring the work to be done on existing, virtually partitioned machines.) Linesch, backed up by Ravi Subramaniam, who has plenty of insight to offer after years of managing Intel’s in-house grid, gave really the only answer one can give in this situation, regardless of how frustrating it might be to someone desperately seeking a cost-efficient, dynamic IT platform for his/her organization: You have to start slow, showing success in one area at a time — perhaps with just one application — and illustrate how that translates into other areas. Not exactly the best way to show off a technology’s full range of capabilities, but not exactly uncommon, either.

This point was hammered home when I sat down to speak with Jay Fry and Ken Oestreich of Cassatt, vice president of marketing and director of product management and marketing, respectively. With its capabilities in areas like capacity on-demand, application virtualization, service level automation, utility metering, etc., Cassatt’s Collage software certainly falls under the “next-generation” umbrella, but customers aren’t always ready to experience it in full force from the get-go. In fact, customers have been known to ask for a pared-down version of Collage, something Cassatt might have to provide in order to show them, one step at a time, that its software is for real.

I was happy to hear, though, that Cassatt is making inroads on another front: the battle to ease customers’ minds about the cultural and organizational changes that come along with the technical changes of a shared IT platform. Gone are the siloed applications and their siloed personnel. Gone are the days of server hugging. Gone are the days of low utilization and high overhead. While these all sound like great things, that kind of change apparently can be quite daunting for IT departments, which is why many are hesitant to cross over into the promised land. Well, Cassatt and consultancy partner BearingPoint have been walking customers through this process, which they believe needs to happen in parallel with the technical changes, in their New York-based customer experience center, and the reaction has been very positive thus far. You can read more about the Cassatt/BearingPoint partnership here, and you can expect to see more about Cassatt’s take on utility computing in the weeks to come.

Speaking of application virtualization and its associated functionalities, the topic came up in a panel discussion featuring three distinct virtualization users and, wouldn’t you know, it seems to be a little much for them at this point. When the topic of “virtualization 2.0” was brought up, defined as including the grid-like abilities (e.g., high availability, SLA management, scalability, etc.) often associated with application virtualization solutions, the response was not overly positive. While Brian Harris, president and founder of Virtual Ngenuity, stated that he believes these functionalities are currently driving business decisions, two of his fellow panelists showed that this might not be entirely the case. Richard Robinson, chief operations officer for the Department of Telecommunications and Information Services, City and County of San Francisco, commented that while his department is working toward these goals, it is not there yet and might not be for quite a while. After all, he noted, the “if it’s not broken, don’t fix it” axiom carries a lot of weight in the local government sector. Sudip Chahal, a senior architect in Intel’s IT Strategy, Architecture and Innovation organization, espoused his belief that this type of architecture isn’t well-suited for traditional business applications, a view not shared by audience member Dave Pearson of Oracle or, I would assume, most of the distributed IT community.

Of course, there was more going on at LinuxWorld/NGDC than just vendors showing off their cutting-edge technologies to, and discussing them with, skeptical end-users. For example, some cool news also came out of the show, such as Appistry adding power-saving functionality to its Enterprise Application Fabric; ServePath announcing high-performance hosting via its virtualized GoGrid service; EnterpriseDB challenging Oracle with GridSQL; and IBM tackling your mountains of widely dispersed data with its grid- and virtualization-powered Information Server Blade, which Big Blue says has demonstrated significant improvements in batch process performance, hardware price performance and budget expenditure. If you’re still hungry for more after reading these announcements, don’t fret, as we’ll have more on all of them (as well as a look at the growing grid hosting business) in the weeks and months to come.

Outside of NGDC news, be sure to check out these noteworthy items: “GigaSpaces Powers Sun’s Market Data Solution”; “NCAR Adds Resources to TeraGrid”; “Imense Using Grid to Become ‘Google of Image Searching’”; “Trigence Intros Optimized App Virtualization Software”; “Sun Releases Fastest Commodity Microprocessor”; and “Layered Tech Announces Super Grid.”

—–

Comments about GRIDtoday are welcomed and encouraged. Write to me, Derrick Harris, at [email protected].
