Essential Analogies for the HPC Advocate or the Trouble with Trying to Explain HPC

By Andrew Jones, NAG

October 25, 2013

Following Part 1, here are some more analogies for HPC …

Duh! Clue’s in the name: Big computer

I see this in so many “Intro to HPC” type courses – defining HPC as a computer 1000x more powerful than a desktop computer. Or worse, as a computer that costs several million dollars, requires a megawatt of power, and fills a room. For bonus points, the weight of the machine or the volume of cooling water it churns through can be quoted. This is not really an analogy – simply a statement of the fact that HPC usually involves extreme computer hardware (and a narrow definition of HPC at that). But the reader or listener is left clueless as to why anyone would fill a room with computers and stump up for a $1M/year electricity bill. In fact, I would go as far as to say that this type of description of HPC (“it’s a big computer”) should be banned from the repertoire of any HPC person wishing to retain the community’s respect – unless it is paired with a solid and inspiring description of the purpose and benefits of HPC.

Not special, just normal: Library

One of the great HPC analogies I have heard describes where HPC should sit in the make-up of R&D organizations, especially universities: HPC should occupy the same position in any research organization as a library – i.e., a core part of the essential infrastructure and a research tool that can be turned to many projects. A university of the last few centuries without a library? As silly as a modern R&D organization without access to HPC facilities. There are tiers of libraries, too. Supporting the university library are national libraries with a greater breadth of material. Equally important are the local research group libraries, with much more specialized texts that may not be found in the larger, more general-purpose collections – and with a lower barrier to access. I’m sure the reader can work out the mapping to the traditional pyramid of HPC tiers.

Imagine a silly task: Aircraft vs. Car

One of the favorite hunting grounds for HPC analogies is explaining the nature and usefulness of the capability vs. capacity distinction. First, let me get a common mistake out of the way: I often see people describing capability as the role of a supercomputer and capacity as the role of a cluster. There is no reason why a well-architected commodity cluster cannot do capability computing, and a poorly implemented supercomputer can certainly be useless for capability work.

Usually we start by asking the reader or listener to imagine a task that needs doing. Let’s say we have to move a thousand shoe boxes from one city to another. We can load up a car (or a group of cars, if we have a team of willing friends) with boxes, drive them to the new location, and repeat as needed. As the problem gets bigger (more boxes or more distant cities), the cars take longer to complete the task, or more cars are needed. However, the cars can still do the job. Now, imagine the destination city is across an ocean. It doesn’t matter how many cars are put onto the job or how much time is allocated – the cars cannot move the boxes across the ocean. But a cargo airplane can. This is capability – a job that cannot be achieved without that platform.

In HPC, capability jobs are those that cannot be completed by waiting longer or by using a collection of smaller resources. This is often equated with jobs that require the whole supercomputer (or half of it, or some other large fraction) – but that is not a general definition of capability. A capability job might need only a small fraction of the machine, but depend on some special feature it has. And not all jobs that use the full size of a system are capability jobs. There is also a great derived analogy: the aircraft can be used for both jobs (assuming the availability of runways, etc.). And so a capability computing system can be used for capacity work too – but the reverse is not true. Although, of course, a system designed for capability might not be as cost-effective when used for capacity workloads.
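For programmers, the distinction can also be grounded in code. Here is a minimal, hypothetical sketch in Python (all names and numbers are invented for illustration, not taken from any real scheduler): capacity work can be chopped into independent pieces and run in as many trips as it takes, while a capability job needs the whole problem handled at once or not at all.

```python
# Hypothetical illustration of capacity vs. capability workloads.

def run_capacity(jobs, cars):
    """Independent jobs: fewer cars just means more trips."""
    results = []
    while jobs:
        batch, jobs = jobs[:cars], jobs[cars:]
        results.extend(job() for job in batch)  # each piece stands alone
    return results

def run_capability(job, machine_memory_tb, working_set_tb):
    """A tightly coupled job whose working set must fit at once.
    If it doesn't fit, no amount of waiting or batching helps --
    the cars cannot cross the ocean."""
    if working_set_tb > machine_memory_tb:
        raise RuntimeError("job needs a bigger (or different) machine")
    return job()

if __name__ == "__main__":
    boxes = [lambda i=i: f"box {i} delivered" for i in range(10)]
    print(run_capacity(boxes, cars=3)[-1])            # works, just takes trips
    print(run_capability(lambda: "flown", 2.0, 1.5))  # fits, so it can run
```

The point of the sketch is the failure mode: a capacity shortfall costs time, while a capability shortfall stops the job entirely.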

Monuments: Ecosystems

Another aspect of HPC that cries out for effective analogies is the need to explain why supercomputing needs proper resourcing – i.e., people and software, not just a room-filling lump of silicon and copper. One impactful analogy I have heard describes supercomputers purchased or deployed without adequate matching investment in software and people as “monuments”: great to look at, but not very functional. A related analogy is to consider a long-haul passenger airplane. To deliver its mission, the airplane must be supplemented by an entire ecosystem of pilots, cabin crew (or flight attendants, if on a US-based airline), runways, passenger terminals, air traffic control, processes/procedures, etc.

Likewise, HPC needs an ecosystem of people, software, datacenters, I/O subsystems, etc., to deliver its mission. And just as with air travel, much of the complexity lies in the ecosystem beyond the hardware product. And – here is the important bit – the differentiation and economic impact come from getting the ecosystem right. Airlines fly the same aircraft as their competitors, just as companies normally have access to the same HPC technology as their competitors. But how the staff interact with customers, the quality of the back-end support, the processes and policies – these are what distinguish one airline from another. Likewise, the software, the support staff, the policies, etc., are what enable each company to gain a competitive advantage over peers who may be using the same HPC technology.

The HPC Hotel

This analogy is great for explaining many different HPC concepts. Imagine your job is to refurbish a hotel. Clearly this task is easier if you have additional workers – more people means the job can be done more quickly, and you can accept contracts to refurbish bigger hotels. But, of course, you need to coordinate all those extra workers. I’m sure you can see the use of this analogy for explaining parallelism and scalability (decomposition, coordination, scheduling conflicts, resource contention, etc.) – the sketch below makes this concrete. You can also use it to introduce special- vs. general-purpose processors (everyone can do any job vs. a combination of plumbers, electricians, plasterers, etc.). And it can be used to explain that a variety of skills is needed to make the refurbishment (the HPC simulation) effective.
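For readers who prefer code to analogy, here is a minimal sketch of the refurbishment in Python (using the standard threading module; the room counts, timings, and the single service elevator are all invented for illustration). The rooms are decomposed across the crew, but one shared elevator serializes part of every room’s work, so doubling the crew does not halve the time.

```python
import threading
import time

ROOMS = 64                   # units of work to decompose across the crew
elevator = threading.Lock()  # one shared service elevator: a serial resource

def refurbish(room: int) -> None:
    with elevator:           # contention: every worker queues here
        time.sleep(0.002)    # hauling materials up, inherently one-at-a-time
    time.sleep(0.01)         # work inside the room, perfectly parallel

def crew(workers: int) -> float:
    chunks = [range(i, ROOMS, workers) for i in range(workers)]  # decomposition
    threads = [threading.Thread(target=lambda c=c: [refurbish(r) for r in c])
               for c in chunks]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()             # coordination: wait for the whole crew to finish
    return time.time() - start

if __name__ == "__main__":
    for n in (1, 4, 16):
        print(f"{n:2d} workers: {crew(n):.2f}s")
```

Running it shows the speedup flattening well before 16x: the elevator (the serial fraction) dominates, just as contended filesystems or interconnects do on a real machine.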

The HPC hotel analogy can also show that the job of running a hotel is not the same as the job of designing one, which is not the same as building or refurbishing one, which is not the same as staying in one. In the same way, it is silly to expect one person to be an expert in using HPC, and in writing the applications, and in running the cluster, and in designing the cluster, and so on. The analogy can also describe areas of differentiation – hotels (HPC facilities) can differentiate themselves on both the rooms (hardware) and the services, staff, and policies (support and software).

So, there you go – a light-touch run-through of some common HPC analogies. What analogies do you use to describe HPC? Which ones have you found, through feedback, to be effective? And which are best left with those packing boxes that have been sitting in the corner of the datacenter since before anyone can remember?
