Simplifying the Hectic World of HPC

By Gabor Samu

May 4, 2018

In the early days of computing, the goal was to automate and speed up the work done by human calculators. Early computers were large and complex devices for the period, and programmers and operators required an intimate knowledge of the workings of these machines to achieve the desired results. Indeed, works that describe the early days of computing, such as Turing’s Cathedral: The Origins of the Digital Universe by George Dyson, give us an intimate look at, and an appreciation for, the computing skills those early pioneers had – in addition to their domain-specific knowledge. Dyson writes: “The ENIAC was programmed by setting banks of 10-position switches and connecting thousands of cables by hand.  Hours, sometimes days, were required to execute a programming change” (75).

Fast forward to today, and supercomputers or high-performance computing (HPC) clusters have evolved to have massive computational abilities – and this power is applied to a mind-boggling array of challenges across numerous industries, with an increasing focus on artificial intelligence (AI). As the user communities of these environments grow, so do the challenges in bringing these crucial resources to bear to best serve the needs of an organization.

HPC environments today are rapidly growing in scale and complexity. It’s now common to find high-end computing environments containing thousands of nodes, with hundreds of users running millions of jobs per day. Computing environments are frequently heterogeneous in nature, containing a mix of processors, accelerators, interconnects, and storage types, all with the goal of delivering performance. From this complexity, three primary concerns arise – usability, efficiency, and oversight.

The User Experience

As high-performance computing has evolved, so have the people who make use of it. Scientists and engineers today are highly focused on research and discovery and are not experts on computer hardware like their predecessors were. For them, HPC is a means of delivering value to the business. From punch cards to keyboards, mice to touch screens, there has been a significant evolution of technology user interfaces over the decades. Today, we are the swipe, pinch and drag generation. Devices ranging from mobile phones to tablets to personal computers all make it easier than ever to interact with systems. Why should users of HPC have it any different? Indeed, organizations today with an investment in HPC need to take into consideration not only raw FLOPS, but also the usability of the environment. Users need to be able to submit and manage their work simply and reliably – regardless of where they are.

Walking the Tightrope

Getting access to resources in a timely fashion is often a tug-of-war in HPC environments. Aligning business needs with demands from the varying projects, groups, and users can be complex, to say the least. The competitive nature of business today precludes relying on first come, first served for running your HPC jobs. Overall, the infrastructure needs to be kept at a boil while keeping your user community cool. Like the proverbial traffic cop, HPC environments require intelligence to make workflows flow, push projects, and eliminate workload and data traffic jams – all to keep the business happy.

Demystifying Your Environment

We’ve discussed the inherent complexity in HPC environments and the importance of not only using the resources in an intelligent manner that’s well aligned to the business, but also making those resources as easily accessible as possible to your user community. Now, let’s shift our focus to the heroes responsible for keeping the environments in top running condition – the administrators. Administrators are the ones we call when jobs fail – which can be for any number of reasons, from a hardware fault or insufficient software licenses to a full filesystem. Ideally, HPC environments require tools for insight that tie together infrastructure, job, and software license details for a comprehensive view. This can enable administrators to quickly get users back up and running when time counts.

Certainly, one could opt for a collection of different tools to fulfill the above requirements.  But this often requires additional work around integration, and results in having several different support providers when things go wrong.  Given these considerations, a holistic software solution for managing a high-performance computing cluster can provide significant benefits.

IBM Spectrum LSF Suites is a tightly integrated systems and HPC workload management solution that provides leading scheduling efficiency and a simplified experience for administrators and users alike – all rolled into one. Built upon 25 years of expertise in workload scheduling and backed by IBM, Spectrum LSF Suites grows with your business needs – with three available editions for progressively larger sites with an increasing level of capabilities. Get peace of mind in a hectic HPC world with IBM Spectrum LSF Suites.

