Addressing HPC’s Last-Mile Problem
Simplified access to shared HPC clusters
The last-mile problem is a recurring theme in business. In logistics and telecommunications, the last-mile problem refers (almost literally) to the difficulty of delivering goods or services from distribution hubs to where customers reside. HPC administrators face a similar challenge as they struggle to simplify user access to HPC applications running in HPC data centers and various public clouds.
IT professionals often think about the last-mile problem in terms of bandwidth and latency. Those factors matter, but the problem is equally about ease of use and access. In this article, I’ll discuss the challenge of delivering HPC applications to client devices, and suggest solutions that can simplify the environment and make users more productive.
HPC applications are diverse
HPC applications vary widely by industry and use case. Common application patterns include serial and parallel jobs, parametric sweeps, interactive applications, long-running services, and multi-step workflows common in genomics or data analytics. Increasingly, HPC applications may run in containers or require specialized devices such as GPUs.
In the early days, HPC applications were often built in-house. In addition to having expertise in a particular field, users needed basic programming and scripting skills and had to know how to interact with a workload manager and transfer files.
While some users will prefer to work at the command line, most HPC applications have evolved to present user-friendly interfaces. Ideally, a researcher or engineer should be immediately productive without needing Linux or programming skills.
Many ways to run HPC applications
Besides providing easier access to applications, modern interfaces allow users to monitor progress, control jobs at run time, access and manipulate simulation-related data, and visualize and retrieve simulation results. Despite the availability of simpler interfaces, there are still many ways to access and manage HPC applications, which makes environments complex for both users and administrators.
Often, the same application provides multiple interfaces and integration methods. For example, a data scientist can run an analytic model written in R (a popular statistical computing language) in several different ways. They might run a program from the command line using Rscript, run it from within RStudio on a client, access RStudio hosted on a remote web-server, or use a custom web interface or Jupyter notebook to execute a model and visualize results.
A mechanical engineer can run a structural analysis within ANSYS Mechanical on a client workstation, submit it through a web-based application interface (provided with IBM Spectrum LSF), or submit it manually by running a pre-written script. In all these cases, the same simulation runs on the HPC cluster, but depending on the application, the context, and where the data resides, users may have good reasons to prefer one approach over another.
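To make the "pre-written script" route concrete, here is a minimal sketch of what such a script might do: assemble a submission command for the workload manager. It assumes an LSF cluster where the standard bsub command is on the path, and it uses only common bsub options (-J for job name, -n for number of tasks, -o for the output file, -q for the queue). The function name, the file name model.R, and the job name are hypothetical and chosen only for illustration. The sketch builds the command but does not run it.

```python
import shlex
from typing import List, Optional


def build_bsub_command(
    command: List[str],
    job_name: str,
    num_tasks: int = 1,
    queue: Optional[str] = None,
    output_file: str = "output.%J",  # LSF replaces %J with the job ID
) -> List[str]:
    """Return an argument list for LSF's bsub. Nothing is executed here."""
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")
    args = ["bsub", "-J", job_name, "-n", str(num_tasks), "-o", output_file]
    if queue is not None:
        args += ["-q", queue]
    # The application command line goes last, after all bsub options.
    return args + command


if __name__ == "__main__":
    # Hypothetical example: run an R model as a 4-task job.
    cmd = build_bsub_command(["Rscript", "model.R"], job_name="r-model", num_tasks=4)
    print(shlex.join(cmd))
```

A real script would pass the resulting list to subprocess.run and would also need to stage input files and collect results, which is exactly the kind of detail users would rather not manage by hand.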
For HPC, Software-as-a-Service falls short
With so many applications and access methods, making HPC simple has been an elusive goal. Software-as-a-Service (SaaS) delivery models that simplify software access in other fields are less helpful here. While SaaS can work for a single application or suite, most HPC centers run many tools and change them constantly. Customers who go down this route can find themselves with multiple SaaS silos that are expensive and inflexible, and that inhibit data sharing and collaboration.
Most of the action in HPC is around infrastructure (IaaS) and platform services (PaaS). Systems vendors offer HPC-ready infrastructure bundles with software tools that simplify cluster deployment and provide connectivity to public clouds. Cloud vendors provide HPC-friendly IaaS and PaaS offerings specific to their own clouds with tools and APIs that simplify the deployment and management of cloud compute instances and storage services.
Applying serverless computing concepts to HPC
To simplify HPC environments, serverless computing provides a useful model. In serverless environments (sometimes called Function-as-a-Service), the software still runs on servers, but infrastructure details are completely hidden. Users simply publish their function to a cloud service and call it whenever they want. Details such as provisioning and scaling containers and cloud instances, loading software, and routing and balancing requests are handled transparently by the cloud service.
Ideally, HPC applications should behave the same way, while still providing advanced users with flexibility. From the client desktop, running an application on an HPC cluster should be seamless. If I have an Abaqus input file on my desktop and want to run a parallel simulation in the cloud, I should be able to “right-click” on the file and run the workload.
Details such as moving data and input files, monitoring results, notifying me when the job is done, and retrieving and presenting results should all be handled for me. In other words, just as serverless environments can simplify some applications, the HPC user experience should be “Clusterless.”
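The "right-click and run" idea can be sketched as a small client-side workflow: upload the input, submit the job, poll until it finishes, notify the user, and fetch the results. The sketch below is an illustration only. The ClusterBackend interface, the FakeBackend stand-in, and the file name beam.inp are all hypothetical and are not part of any IBM product API. The state names RUN and DONE simply mirror the job states LSF reports.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Protocol


class ClusterBackend(Protocol):
    """Hypothetical interface a desktop client might hide from the user."""

    def upload(self, local_path: str) -> str: ...
    def submit(self, remote_input: str) -> str: ...
    def status(self, job_id: str) -> str: ...
    def download(self, job_id: str) -> str: ...


@dataclass
class FakeBackend:
    """In-memory stand-in so the sketch runs without a real cluster."""

    polls_until_done: int = 2
    _polls: Dict[str, int] = field(default_factory=dict)

    def upload(self, local_path: str) -> str:
        return f"/remote/{local_path}"

    def submit(self, remote_input: str) -> str:
        job_id = f"job-{len(self._polls) + 1}"
        self._polls[job_id] = 0
        return job_id

    def status(self, job_id: str) -> str:
        # Report RUN until the job has been polled enough times.
        self._polls[job_id] += 1
        return "DONE" if self._polls[job_id] >= self.polls_until_done else "RUN"

    def download(self, job_id: str) -> str:
        return f"results-{job_id}.dat"


def run_clusterless(
    backend: ClusterBackend, local_input: str, notify: Callable[[str], None]
) -> str:
    """Run one job end to end and return the local path of the results."""
    remote_input = backend.upload(local_input)   # move input data to the cluster
    job_id = backend.submit(remote_input)        # hand the job to the scheduler
    while backend.status(job_id) != "DONE":      # monitor progress
        pass  # a real client would sleep between polls and handle failures
    notify(f"{job_id} finished")                 # local notification
    return backend.download(job_id)              # retrieve results


if __name__ == "__main__":
    print(run_clusterless(FakeBackend(), "beam.inp", print))
```

The point of the design is that the user calls one function (or clicks one menu item) while the four backend steps stay out of sight.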
Clusterless computing for HPC
To facilitate easy access to HPC applications, IBM Spectrum LSF Suite includes an application-centric user portal. IBM Spectrum LSF Suite provides a tightly integrated solution that delivers advanced workload and resource management capabilities on-premises and in the cloud, supporting transparent “bursting” of workloads to various public clouds based on configurable policies.
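As a rough illustration of what a policy-driven bursting decision can look like, the sketch below requests cloud instances only when the local backlog exceeds a threshold, and never beyond a configured cap. This is a toy model under my own assumptions. The class, field, and function names are hypothetical, and it does not represent how IBM Spectrum LSF expresses or evaluates its policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BurstPolicy:
    """Hypothetical policy knobs, not an LSF configuration format."""

    max_pending_jobs: int     # backlog the local cluster is allowed to absorb
    max_cloud_instances: int  # hard cap on concurrently running cloud instances


def instances_to_request(
    policy: BurstPolicy,
    pending_jobs: int,
    running_cloud_instances: int,
    jobs_per_instance: int = 1,
) -> int:
    """Return how many additional cloud instances to request right now."""
    backlog = pending_jobs - policy.max_pending_jobs
    if backlog <= 0:
        return 0  # the local cluster can absorb the queue
    needed = -(-backlog // jobs_per_instance)  # ceiling division
    headroom = policy.max_cloud_instances - running_cloud_instances
    return max(0, min(needed, headroom))


if __name__ == "__main__":
    policy = BurstPolicy(max_pending_jobs=10, max_cloud_instances=5)
    print(instances_to_request(policy, pending_jobs=13, running_cloud_instances=0))
```

A production policy would typically weigh more than queue depth, for example cost, data location, and software licensing, but the shape of the decision is the same: the user submits a job and the policy, not the user, decides where it runs.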
The integrated, application-centric user portal provides simplified access to applications, workloads, and data. Users interact with a tailorable web interface to run and monitor jobs and workflows, manage job-related data, and visualize and share simulation results.
The IBM Spectrum LSF Application Center Desktop Client addresses the last-mile challenge, bringing this same functionality to the Microsoft Windows desktop without the need for a browser. With the desktop client, HPC users enjoy a seamless user experience. The desktop client:
- Manages authentication against remote clusters
- Exposes application-specific job interfaces natively on the client
- Transparently moves data to and from a remote cluster
- Provides local notification of changes in job status
- Provides visual monitoring and management of jobs and multi-step workflows
Users can still use a web interface or command line if they prefer, but from the perspective of the application user, it’s as though there is no cluster at all. Users simply see faster results and enjoy increased productivity without ever leaving their familiar desktop environment.
And for those times when you’re away from your desk, the IBM Spectrum LSF Mobile Client, available from the Apple App Store and Google Play, provides similar job monitoring, control, and status notification features on your handheld device. A short video below shows the Application Center Desktop Client in action.
You can learn more about IBM Spectrum LSF Suite by visiting https://www.ibm.com/marketplace/hpc-workload-management.