Complex computational workflows involve sequences of tasks with dependencies between them. For example, biomedical research institutes and laboratories use workflows to process huge amounts of genomic, clinical, and environmental data, link them based on data models and taxonomies, annotate them with machine or human knowledge, and then feed them into a common analytical platform to produce scientific results. Design and manufacturing companies use workflows consisting of many complex design processes, data sets, and simulations to facilitate and accelerate product design and development.
There are several major challenges relating to the management and use of workflows in an HPC environment, which are detrimental to time to solution:
* Maintainability – Although scripting can be used as a means to codify workflows, scripts are difficult to update, debug and maintain over time.
* Collaboration – Organizations today frequently collaborate with external partners, who may not be able to run nonstandard workflows, forcing processes to be re-factored at additional cost and delay.
* Oversight – Using multiple disparate tools does not provide a clear view of the state of the workflow during execution or a means of identifying errors should they arise.
* Efficiency – Ineffective scheduling leaves resources underutilized, delaying results and increasing costs.
Therefore, there are a number of key capabilities to consider when deploying a solution to manage and accelerate workflows in an HPC environment – ranging from creation of workflows through to advanced scheduling, monitoring and error mitigation. Here are some recommended best practices to drive overall productivity with complex computational workflows:
* Scalability – Provide the ability to seamlessly scale workflows to help speed results as new resources are added to HPC infrastructure.
* Portability – Ensure support for industry standard workflows to enable easy and rapid collaboration between organizations and their partners.
* Ease of use – Utilize a graphical editor for speeding the creation and updates of workflows.
* Error notification and mitigation – Provide automatic alerting when an error condition arises in a step of the workflow and the ability to restart workflows from the last known point when an error condition has been corrected.
The IBM Spectrum LSF family provides advanced capabilities for the management of complex computational workflows – from creation through to execution and error handling. IBM Spectrum LSF Process Manager provides an intuitive drag-and-drop interface for designing reusable workflows based on business logic. Workflows can be easily updated or incorporated into new workflow definitions as subflows, and powerful scheduling capabilities allow users to schedule workflows at a specific time or trigger them when one or more events occur. Users can monitor the progress of workflows via a graphical interface, making it easy to see which step of the workflow is executing at any given point in time. Dependencies between workflow steps are automatically managed, and failure mitigation is configurable, allowing for unattended recovery. For example, a given workflow may fail at a certain step due to insufficient memory. Spectrum LSF Process Manager can detect the reason for the step failure and re-queue the workflow with a larger memory requirement, or execute a user-defined post-failure script to take site-specific actions.
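The recovery pattern described above – detect an out-of-memory failure, enlarge the memory request, and re-queue – can be sketched in a few lines. This is a minimal illustration of the pattern only; the step runner, result dictionary, and retry policy below are our own assumptions, not the Process Manager API.

```python
# Hypothetical sketch of "re-queue with a larger memory request" recovery.
# All names here are illustrative, not Spectrum LSF Process Manager calls.

OUT_OF_MEMORY = "out_of_memory"

def run_step(step, mem_limit_mb):
    """Simulated workflow step: fails when the memory limit is too small."""
    if step["peak_mem_mb"] > mem_limit_mb:
        return {"ok": False, "reason": OUT_OF_MEMORY}
    return {"ok": True, "reason": None}

def run_with_recovery(step, mem_limit_mb, max_retries=3, growth=2.0):
    """Retry a failed step with a growing memory request, up to max_retries."""
    attempts = []
    for _ in range(max_retries + 1):
        result = run_step(step, mem_limit_mb)
        attempts.append((mem_limit_mb, result["ok"]))
        if result["ok"] or result["reason"] != OUT_OF_MEMORY:
            return result["ok"], attempts
        mem_limit_mb = int(mem_limit_mb * growth)  # enlarge request, retry
    return False, attempts

# A step needing ~3000 MB succeeds on the third attempt (1024 → 2048 → 4096).
ok, attempts = run_with_recovery({"name": "align", "peak_mem_mb": 3000}, 1024)
```

A real deployment would also cap the maximum request and hand persistent failures to a post-failure script, as the product does.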
The recently announced open source CWLEXEC tool enables workflows defined in the Common Workflow Language (CWL) specification to be run seamlessly in a Spectrum LSF cluster. This allows organizations to leverage portable CWL workflows, while taking advantage of the advanced workflow management capabilities provided by Spectrum LSF Process Manager including efficient and scalable scheduling and self-healing of workflows.
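Submitting a portable CWL workflow to an LSF cluster then reduces to invoking cwlexec on the workflow and its input settings. The sketch below assumes only the basic `cwlexec <workflow> <inputs>` invocation shape and placeholder file names; consult the cwlexec documentation for its actual options.

```python
# Hedged sketch: launching a CWL workflow on LSF via the open-source cwlexec
# tool. File names are placeholders; the runner is injectable for testing.
import subprocess

def run_cwl_workflow(workflow_path, inputs_path, runner=subprocess.run):
    """Invoke cwlexec on a CWL workflow file and its input settings file."""
    result = runner(["cwlexec", workflow_path, inputs_path])
    return result.returncode

# Example (requires cwlexec on PATH and an LSF cluster):
# run_cwl_workflow("rna-seq.cwl", "rna-seq-inputs.yml")
```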
Combined with the leading scalable filesystem, IBM Spectrum Scale, Spectrum LSF provides an end-to-end solution for workflow metadata and full data provenance. By using Spectrum Scale extended attributes, Spectrum LSF enables users of workflows to traverse their workflow data generation path from output files back to input files, even after a given file has been modified or moved.
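The idea of traversing a data-generation path can be illustrated with a small in-memory lineage graph. In the solution described above, the same parent-file information would live in Spectrum Scale extended attributes rather than a Python dict; the file names and graph here are made up for illustration.

```python
# Minimal sketch of walking provenance from an output file back to its root
# inputs. `derived_from` maps each file to the files it was generated from
# (assumed example data, not real workflow output).
derived_from = {
    "report.pdf": ["summary.csv"],
    "summary.csv": ["aligned.bam"],
    "aligned.bam": ["sample.fastq", "reference.fa"],
}

def trace_inputs(path, graph):
    """Return the set of root input files reachable from `path`."""
    parents = graph.get(path, [])
    if not parents:              # no recorded parents: a root input file
        return {path}
    roots = set()
    for parent in parents:
        roots |= trace_inputs(parent, graph)
    return roots

print(sorted(trace_inputs("report.pdf", derived_from)))
```

Because each file records its own parents, the traversal still works after intermediate files are moved or rewritten, as long as the recorded lineage travels with them.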
So, if you’ve decided to go with the (HPC) flow, give your users the tools they need to boost productivity and get results faster. The Spectrum LSF family provides the tools you need to design complex distributed workflows quickly and to run them with confidence across your HPC infrastructure. For a closer look at the powerful HPC workload management capabilities of the IBM Spectrum LSF family, visit here.