The Eye of the Cyclone: Christian Tanasescu on SGI’s Role in Technical Cloud Computing

By Dr. Wolfgang Gentzsch

March 7, 2011

One year ago, SGI announced its SGI Cyclone for large-scale, on-demand cloud computing services specifically dedicated to technical applications. Around this first anniversary, it seemed like the perfect time to get an update from someone who is deeply involved with Cyclone, Christian Tanasescu. As Vice President of Software Engineering at SGI, Christian leads, among other things, SGI’s activities around Cyclone. He was an easy catch for me because we know each other well from the good old Fortran 90 days!

Just for some background, in his VP of Software Engineering role, Christian is responsible for system software and middleware development, applications, ISV relationships and leadership of all activities around Cyclone, one of the first cloud offerings on the market dedicated to HPC applications. Since joining SGI in 1992, Christian has held a number of management positions in HPC system engineering, strategic partner management, performance modeling and application enablement. As initiator of the Top20Auto study to analyze development of HPC platforms and applications in the automotive industry, Christian has extensive knowledge of the manufacturing vertical. Prior to SGI, Christian worked at Fujitsu-Siemens in compiler development and served on the Fortran 90 standardization committee. Christian holds a master’s degree in Computer Science from Polytechnic University of Bucharest.

Wolfgang:  Christian, let’s start with the state of affairs at SGI and its current focus in the marketplace.

Christian: SGI is focused on the technical computing market, which addresses the ‘big data’ needs of both mission critical technical and business applications. Technical computing problems in science, engineering and business are addressed by compute-intensive and data-intensive applications. Compute-intensive workloads are model-based computations in which every single data element is important and the basic method is hypothesis testing; the model can be deconstructed, and the workload runs well on clusters. Data-intensive workloads tend to be model-free computations that are difficult or impossible to deconstruct; their basic method is pattern discovery, and they run well in shared memory. Our goal is to accelerate time to results for customers in our target markets, which include Internet and Cloud, Government, Research and Education, Manufacturing, Energy, and Financial Services.

Wolfgang:  Please, tell us about SGI Cyclone.

Christian: SGI Cyclone cloud computing service is one of the first specifically dedicated to scientific and engineering applications.  When SGI began to design Cyclone we were razor focused on offering our customers the applications that they are currently using to create their products or do cutting edge research – in other words, applications that “are their business,” as opposed to applications that “support their business.”  Email, CRM and HR programs are important support functions for any company, but technical applications are those that result in the design of a better, safer and quieter car or airplane, help discover new drugs or new oil reserves, or offer better forecasts of the weather, to name just a few examples. 

Through Cyclone, SGI offers its performance-optimized software stack and hardware together with key technical computing applications from its partners or open source in the domains of Computational Biology, Computational Chemistry and Materials, Computational Fluid Dynamics, Finite Element Analysis, Computational Electromagnetics and Data Analytics. 

A prominent feature of Cyclone is its flexibility of choice, because technical workloads have very different computational requirements. On Cyclone, customers have a choice of platforms (scale-up or scale-out), accelerators (NVIDIA Fermi, ATI FireStream and Tilera), operating systems (SUSE, RHEL, CentOS, Windows), interconnects (NUMAlink, InfiniBand, Gigabit Ethernet) and topologies (hypercube, all-to-all, fat-tree, single or dual rail).

Wolfgang:  Many clouds offer Software as a Service (SaaS) and Infrastructure as a Service (IaaS). You are now offering a new kind of service model – Expertise as a Service (EaaS). Why do you think there is a need for this new service model?

Christian: EaaS is the consultative component of our HPC Cloud that brings real value to our computational science and engineering customers. We currently offer over 20 technical applications in the six HPC domains mentioned above. When we asked one of our primary ISV partners if they would work with a service like Amazon EC2’s new HPC service, they declined because they don’t have the in-house expertise to help their customers.  SGI Cyclone can offer their software because we have a team of technical application engineering experts who for many years have been supporting the optimization and benchmarking of their software on our hardware systems. So it is logical by extension that we now offer our customers this Expertise as a Service (EaaS) model.

The other rationale for this EaaS offering is to enable a wider adoption of HPC in smaller and medium size companies.  Analysts estimate that 100,000 companies exist in the manufacturing sector in the US that could use simulation technology as an intellectual amplifier in advancing their product designs, but cannot afford to do so because of the lack of (a) funds to acquire a cluster, (b) funds to buy the necessary software licenses, (c) IT expertise to operate the cluster, or (d) application expertise with commercial applications. Cyclone, with our extensive server infrastructure, software, and expertise as a service enables these companies to now get easy access to HPC platforms and applications. 

Wolfgang:  How does EaaS work?

Christian: Let’s take for example a small or medium size business that is running a package like LS-DYNA on workstations to perform structural analysis of a new product they are designing.  They are under a tight deadline to deliver their results and they need to greatly accelerate the turnaround time on their job runs. They could look into purchasing a cluster, but capital budgets are tight and they don’t have the IT expertise or bandwidth to set up and run a cluster. At some point they consider SGI Cyclone, where they can talk with an LS-DYNA applications expert, who not only helps them determine the size of the system they will need to quickly run their jobs, but also walks them through the process and, if requested, loads and launches their jobs for them. A set of simulations that takes three weeks on a quad-core workstation might only take 12 hours on a 256-core SGI Altix ICE 8400 InfiniBand cluster. The customer gets the results they need quickly and efficiently, without having to go through the hassle of buying hardware or the added expense of extra yearly software licenses. They only pay for what they used to run their simulations.
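To put the turnaround figures in that example into perspective, here is a minimal back-of-the-envelope sketch. It is not part of the interview: it assumes the workstation runs around the clock for the full three weeks, and the function names are hypothetical helpers introduced only for illustration. Under those assumptions it computes the implied speedup and how efficiently the extra cores are being used.

```python
# Back-of-the-envelope check of the turnaround example above.
# Assumption (not stated in the interview): the workstation runs 24x7
# for the full three weeks, so wall-clock time equals compute time.

HOURS_PER_WEEK = 7 * 24


def speedup(baseline_hours: float, accelerated_hours: float) -> float:
    """Ratio of baseline wall-clock time to accelerated wall-clock time."""
    return baseline_hours / accelerated_hours


def parallel_efficiency(speedup_factor: float,
                        baseline_cores: int,
                        accelerated_cores: int) -> float:
    """Achieved speedup divided by the ideal speedup (the core-count ratio)."""
    ideal_speedup = accelerated_cores / baseline_cores
    return speedup_factor / ideal_speedup


workstation_hours = 3 * HOURS_PER_WEEK   # three weeks on a quad-core workstation
cluster_hours = 12                       # 12 hours on a 256-core cluster

s = speedup(workstation_hours, cluster_hours)
e = parallel_efficiency(s, baseline_cores=4, accelerated_cores=256)

print(f"Workstation time: {workstation_hours} h")
print(f"Speedup:          {s:.0f}x")
print(f"Efficiency:       {e:.1%} of the ideal {256 // 4}x")
```

With these inputs, three weeks is 504 hours, so the speedup is 42x on 64 times as many cores, or roughly 66 percent of ideal linear scaling. The real figure would depend on the model, the solver and how continuously the workstation was actually used.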

Wolfgang:  What are some of the challenges that need to be addressed to run HPC workloads in the cloud?

Christian: On the hardware side, most cloud vendors offer virtualized instances on servers with limited scalability, memory allocation and lack of user control over node interconnect topology, which leads to unacceptable MPI latency while running many technical applications.  SGI Cyclone squarely addresses these issues. We use virtualization technology only in the login/management node layer of the platform. We then provide bare metal access (I call it ‘physicalization’) to run their applications on our scale-out clusters, scale-up shared memory systems, or our hybrid clusters with accelerators. 

On the software side, we have also found that many third-party commercial ISVs fear that providing their software via the cloud will crater their existing annual licensing revenue. We have been encouraging our software partners to experiment with this new business model by working with existing customers who already own annual licenses and providing them with easy access to purchase additional licenses via Cyclone or directly from the ISV.  Some ISVs get it, while others are taking a wait-and-see stance.

Wolfgang:  Similar to ISVs, aren’t you concerned that selling compute cycles in the cloud will erode SGI product revenue?

Christian: This is not what we have observed to date.  For us, it is about customer choice. We ask our customers the following three simple questions: “What problem are you trying to solve, how much equipment do you feel you need today and will need in the future, and when do you need it?”  We have a robust ‘build-to-order’ data center and modular data center business, and are participating in the current tech refresh cycle that is happening at large Internet companies and within financial services and virtualized cloud IT centers.

We are selling our new shared memory Altix UV and the latest version of our SGI Altix ICE scale-out clusters to our government, research and education, manufacturing, and energy customers.  With Cyclone we complete the need.  If our customers need a bridge to keep working as they wait to receive their newly acquired SGI platform they can use Cyclone. If they need to test new SGI technology before buying they can use Cyclone.  If they need to combine on-site compute resources sized for an average workload with cloud-bursting capability, they can use Cyclone.  We help them achieve these goals. And finally, if a new customer has a tight capital budget and they don’t have the IT expertise to set up and run a cluster, we can help them with our Cyclone Expertise as a Service (EaaS).

Wolfgang:  What do you see as the role of cloud computing in future IT infrastructure?

Christian: Cloud computing is morphing the client-server model. On the server side, the path is going from supercomputers to datacenters to co-located datacenters to the ubiquitous use of the cloud. On the client side, the world is moving from the workstation/PC to netbooks, tablets and location-aware smart phones. I think this trend leads to fewer, very large data centers with their own co-located power plants providing cloud access to millions and eventually billions of mobile clients. The new smart phones coming onto the market will have enough compute capabilities to replace the business notebook and will plug into a docking station to perform basic office work using cloud-based applications and storage. 

Of course, there are challenges that need to be addressed, the most important being inexpensive, sustainable power for these mega datacenters, as well as access to, and the security of, mobile business data.

Dr. Wolfgang Gentzsch is the General Chair for ISC Cloud’11, http://www.isc-events.com/cloud11/

HPCwire