HPC User Forum Explores Leadership Computing

By Nicole Hemsoth

September 30, 2005

At Oak Ridge National Laboratory this week, 131 HPC User Forum participants from  the U.S. and Europe discussed current examples of leadership computing and challenges in moving toward petascale computing by the end of the decade.

Vendor updates were given by Cray Inc., Hewlett-Packard Co., Intel Corp., Level 5 Networks Inc., Liquid Computing Corp., Panasas Inc., PathScale Inc., Silicon Graphics Inc. and Voltaire Inc.

According to IDC vice president Earl Joseph, who serves as executive director of  the HPC User Forum, the buying power of users at the meeting exceeded  $1 billion. In his update on the technical market, he noted that revenue grew  49 percent during the past two years, reaching $7.25 billion in 2004. Clusters have redefined pricing for technical servers. The new IDC Balanced Rating tool  (www.idc.com/hpc) allows users to custom-sort and rank the performance of 2,500  installed HPC systems on a substantial list of standard benchmarks, including  the HPC Challenge tests.

Paul Muzio, steering committee chairman and vice president of government programs for Network Computing Services, Inc. and Support Infrastructure Director of the Army High Performance Computing Research Center, said the HPC User Forum's overall goal is to promote the use of HPC in industry, government and academia.  This includes addressing important issues for users. 

Jim Roberto, ORNL Deputy for Science and Technology, welcomed participants to  the lab and gave an overview. ORNL is DOE's largest multipurpose science laboratory, with a $1.05 billion annual budget, 3,900 employees and 3,000 research guests annually. A $300 million modernization is in progress. ORNL's new $65 million nanocenter begins operating in October and complements the lab's neutron scattering capabilities.

Thomas Zacharia, ORNL's associate director for Computing and Computational  Sciences, said computational science will have a profound impact in driving  science forward. ORNL, selected to be the DOE's main facility for Leadership Computing, plans to grow its machines to 100 teraflops, then to a petaflop by  the close of the decade. Researchers have made fundamental new discoveries with the help of the Cray X1 and X1E systems. The lab expects to put its Cray XT3 into production in the October-November timeframe. Based on estimates from  vendors, Zacharia expects a petascale system to have about 25,000 processors, 200 cabinets and power requirements of 20-40 megawatts.

According to Jack Dongarra of the University of Tennessee, the HPC Challenge benchmark suite stresses not only the processors but also the memory system and interconnect. The suite characterizes architectures with a wider range of metrics that capture the spatial and temporal locality of applications. The goal is for the suite to take no more than twice as long as Linpack to run. At SC2005, HPCC Awards sponsored by HPCwire and DARPA will be given in two classes: performance only and productivity (elegant implementation). Future goals are to reduce execution time, expand the suite to include additional tests such as sparse matrix operations, and develop machine signatures.
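
To make the locality point concrete, here is a minimal C sketch (illustrative only, not the actual HPCC source) contrasting two kernels at opposite ends of the spectrum the suite is designed to span: a STREAM-style triad with high spatial locality that mainly stresses memory bandwidth, and a RandomAccess-style table update with essentially no locality that hammers the memory system with tiny, irregular references.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define N (1u << 22)   /* toy problem size; real HPCC runs use far larger arrays */

/* STREAM-style triad: contiguous accesses, high spatial locality,
 * limited mainly by memory bandwidth. */
static void triad(double *a, const double *b, const double *c, double s)
{
    for (size_t i = 0; i < N; i++)
        a[i] = b[i] + s * c[i];
}

/* RandomAccess-style updates: pseudo-random table references with
 * essentially no locality; in the parallel version these small,
 * irregular updates also stress the interconnect. */
static void random_update(uint64_t *table)
{
    uint64_t x = 1;
    for (size_t i = 0; i < N; i++) {
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;  /* simple LCG */
        table[x & (N - 1)] ^= x;
    }
}

int main(void)
{
    double   *a = malloc(N * sizeof *a);
    double   *b = malloc(N * sizeof *b);
    double   *c = malloc(N * sizeof *c);
    uint64_t *t = calloc(N, sizeof *t);
    if (!a || !b || !c || !t)
        return 1;
    for (size_t i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }
    triad(a, b, c, 3.0);
    random_update(t);
    printf("a[0] = %g, t[0] = %llu\n", a[0], (unsigned long long)t[0]);
    free(a); free(b); free(c); free(t);
    return 0;
}

Linpack, by contrast, is dense and cache-friendly, which is why a suite spanning both ends of this spectrum reveals more about the memory system and interconnect than Linpack alone.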

Muzio chaired a session on government leadership and partnerships, asking each speaker to comment on organizational mission, funding and outreach. Rupak Biswas, from NASA Ames Research Center, reviewed NASA's four mission directorates and said his organization, which hosts the Columbia system, has special expertise in  shared memory systems.

Cray Henry said the DoD High Performance Computing Modernization Program (HPCMP) focuses on the DoD's science and technology and test and evaluation communities. HPCMP wants machines in production within three months of buying them and uses funds for specific projects, software portfolios (applications development), partner universities, and the annual technology insertion process, which expends $40 million to $80 million per year to acquire large HPC systems for the HPCMP centers. The program works with other agencies on benchmarking, partners with industry and other defense agencies on applications development, and maintains academic partnerships.

Steve Meacham said NSF wants input from the HPC community on how best to develop a census of science drivers for HPC at NSF, and on how the science community would like to measure performance. NSF's goal is to create a world-class HPC environment for science. HPC-related investments are made primarily in science-driven HPC systems, systems software, and applications for science and engineering research. In 2007, NSF will launch an effort to develop at least one petascale system by 2010 and will invite proposals from any organization with the ability to deploy systems on this scale.

Gary Wohl explained that NOAA is a purely operational shop that does numerical weather prediction and short-term numerical climate prediction. The primary HPC goal is reliability for on-time NOAA products. NCEP and IBM share responsibility for 99 percent on-time product generation. Changes in the HPC landscape include greater stress on reliability, a dearth of facility choices, and burgeoning bandwidth requirements.

In the ensuing panel discussion, participants stressed that the federal  government needs to recognize HPC as a national asset and a strategic priority. Non-U.S. panelists echoed the message.

Suzy Tichenor, vice president of the Council on Competitiveness, showed a video produced in collaboration with DreamWorks Animation to explain HPC to non-technical audiences and get them excited about it. Meeting attendees applauded the video, which can be ordered at www.compete.org. Tichenor reviewed the Council's HPC Project and its surveys, which found, among other things, that HPC is essential to survival for U.S. businesses that exploit it.

DARPA's Robert Graybill updated attendees on the HPCS program, noting Japan plans to develop a petascale computer by 2010-2011 that will have a heterogeneous architecture (vector/scalar/MD).

In related presentations, Michael Resch of HLRS, Michael Heib from T-Systems and  Joerg Stadler of NEC described their successful partnership in Germany, which  includes a joint venture company to buy and sell CPU time and the innovative Teraflop Workbench Project, whose goal is to sustain teraflop performance on 15  selected applications.

Sharan Kalwani from General Motors reviewed the automaker's business transformation, noting that GM is involved with one of every six cars in the world. Today, GM can predict how much compute time and money it will need to develop a new car. Senior management is convinced of the value of HPC, Kalwani said.

David Torgersen's role is to bring shared IT infrastructures to Pfizer. Challenges include vendors selling directly to business units for point solutions that don't reflect the company's needs; differing business needs at various points in the drug development process; and the fact that grid technology is mature in some respects but not in others.

Jack Wells of ORNL, Thomas Hauser of Utah State, Jim Taft of NASA and Dean Hutchings of Linux Networx explored possibilities for partnering to boost the performance of the Overflow code on clusters. They explained why none of their organizations would do this on its own, then reviewed the challenges and potential next steps.

Jill Feblowitz of IDC's Energy Insights group said the financial health of the utility industry has been slowly improving since the Enron collapse. In contrast, the oil and gas industries have seen a run-up in profits, although these profits have not yet translated into an increased appetite for technology and investments. The Energy Policy Act of 2005 specifically includes HPC provisions for DOE. She described the concepts of “the digital oilfield” and the Intelligent Grid.

Marie-Christine Sawley, director of the Swiss National Supercomputer Center  (CSCS), described her organization and its successful, pioneering use of the HPC Challenge benchmarks in the recent procurement of a large-scale (5.7 teraflops) HPC system in conjunction with Switzerland's Paul Scherrer Institute.

Thomas Schulthess reviewed ORNL's materials science work on superconductivity, which has revolutionary implications for electricity generation and transmission. Two decades after the discovery of high-temperature superconductors, they remain poorly understood. Using quantum Monte Carlo techniques, the team of ORNL users explicitly showed for the first time that superconductivity is accurately described by the 2D Hubbard model.
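
For readers who want the reference point, the two-dimensional Hubbard model cited here is the standard lattice Hamiltonian of strongly correlated electrons, written below in conventional notation, where t is the nearest-neighbor hopping amplitude and U the on-site Coulomb repulsion:

H = -t \sum_{\langle i,j \rangle, \sigma} \left( c_{i\sigma}^{\dagger} c_{j\sigma} + \mathrm{h.c.} \right) + U \sum_{i} n_{i\uparrow} n_{i\downarrow}

Despite its apparent simplicity, the model cannot be solved exactly in two dimensions, which is why quantum Monte Carlo simulation on leadership-class systems was required to settle the question.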

Bill Kramer said NERSC focuses on capability computing, with 70 percent of its time going to jobs of 512 processors or larger. NERSC has won numerous awards for its achievements in the DOE's INCITE and predecessor “Big Splash” programs. In the related panel discussion, participants from industry, government and academia stressed the need for better algorithms and methods.

Frank Williams from ARSC is chair of the Coalition for the Advancement of  Scientific Computation, whose members represent 42 centers in 28 states. CASC disseminates information about HPC and communications and works to affect the national investment in computational science and engineering on behalf of all academic centers. Williams invited HPC User Forum participants to attend a CASC meeting and to contact him at [email protected].

IDC's Addison Snell moderated a panel discussion on leadership computing in academia. HPC leaders from the University of Cincinnati, Manchester University (UK), ICM/Warsaw University and Virginia Tech discussed their organizations, leadership computing achievements and the challenges of moving toward petascale computing. In another panel discussion moderated by Snell, HPC vendors debated the issues with cluster data management and what needs to be done to improve the handling of data in large HPC installations.

Phil Kuekes of HP gave a talk on molecular electronics and nanotechnology (“nanoelectronics”), summarizing HP's progress toward developing a nanoscale switch with the potential to overcome the limitations of existing integrated circuit technology.

Muzio and Joseph facilitated a session on architectural challenges in  moving toward petascale computing. According to Muzio, an application engineer's ideal petascale system would have a single processor, uniform memory, Fortran or C, Unix, a fast compiler, an exact debugger and the stability to enable applications growth over time. By contrast, a computer scientist's ideal petascale system would have tens of thousands of processors, different kinds of  processors, non-uniform memory, C++ or Java, innovative architecture and radically new programming languages.

“Unfortunately, for many users, the computer scientist's system may be built in the near future,” he said. “The challenge is to build this kind of system but make it  look like the kind the applications software engineer wants.”

According to Robert Panoff of the Shodor Educational Foundation, math and science are more about pattern recognition and characterization than mere symbol manipulation. He also pointed to the lag time between discoveries and their application.

“The people who will use petascale computers are now in high school to grad school, while most of us are approaching retirement,”  he added. “You don't need petascale computing for this teaching, but this will help produce the  people needed to do petascale computing.”

David Probst of Concordia University argued that scaling to petaflop capability cannot be done without embracing heterogeneity. Global bandwidth is the most critical and expensive system resource, so he said we need to use it well throughout each and every computation. “Heterogeneity is a precondition for this in the face of application diversity, including diversity within a single application,” Probst added. “Every petascale application is a dynamic, loosely coupled mix of high thread-state, temporally local, long-distance computing and low thread-state, spatially local, short-distance computing.”

Burton Smith, chief scientist at Cray, challenged the popular definitions of  “petascale,” “scale” and “local.” The popular definition of scale “doesn't mean  much, maybe that I ran it on a few systems and it seemed to go fast,” he said. “You  probably mean it message-passes with big messages that don't happen very often. Also, people say 'parallel' when they mean 'local.'” He concluded that parallel  computing is just becoming important; we know how to build good petascale  systems if the money is there; and sloppy language interferes with our ability  to think.

According to Michael Resch of HLRS, the community needs to “move on from MPI to a real programming language or model. I hear people complaining about how hard it is to program systems with large numbers of processors. What about buying systems with a smaller number of more-powerful processors? Why not buy high-quality systems?”

Muzio introduced the companion panel discussion on “software issues in moving  toward petascale computing” by reviewing the HPC User Forum's achievements in promoting better benchmarks and underscoring the limited scalability and capabilities of ISV application software.

Suzy Tichenor reviewed the Council on Competitiveness' recent “Study of ISVs  Serving the HPC Market: The Need For Better Application Software.” The study found the business model for HPC-specific application software has evaporated, leaving most applications unable to scale well. Market forces alone will not address this problem and need to be supplemented with external funding and expertise. Most ISVs are willing to partner with other organizations to accelerate progress.

DARPA's Robert Graybill said the HPCS program is looking at how to measure productivity, and that he believes new programming languages are needed. The community needs time to experiment before deciding which HPC language attributes are the right ones; the goal is to put together an industry consortium by 2008 to pursue this. I/O is another major challenge, he said.

BAE Systems' Steve Finn, chair of the HPCMP's User Advocacy Group, said continuous improvements are still occurring to legacy codes and large investments have been made in scalable codes. “We need to prioritize which  codes to rewrite first [for petascale systems],” he added. “UPC and CAF won't be the final  languages. It's good to try them out, but if you rewrite them now, you may need to rewrite them again in a few years.”

The next HPC User Forum meeting will take place April 10-12, 2006 in Richmond,  Va. The meetings are co-sponsored by HPCwire.
