Introducing the Spack Rolling Binary Cache hosted on AWS

Today we’re excited to announce the availability of a new public Spack Binary Cache, hosted on AWS. Spack users can now pull pre-built packages from a public build cache stored in Amazon Simple Storage Service (Amazon S3).

Using this Binary Cache can make installs of common Spack packages up to 20x faster. The cache is the result of a collaboration between AWS, E4S, Kitware, and the Lawrence Livermore National Laboratory (LLNL), and is operated by the Spack open-source project team, which has done some amazing work supporting the HPC community.

Background

We often comment that HPC software is almost defined by its complexity. Much of the time, that complexity comes from truly vast dependency trees like the one in Figure 1.

*Figure 1: Spack build dependency graph for WRF with GCC and Open MPI on AWS*

Tracking the build dependencies of an application, installing, and maintaining them is a non-trivial task. Compiling them is also a complex procedure, full of nuanced application requirements, compiler flags, and optimizations. These build configurations make a significant performance difference when you’re running on specific CPU and GPU architectures. Before Spack, this whole process used to take days or weeks, and ate into a researcher’s time – getting in the way of the next discovery.

To AWS, this job of building software repeatedly (and reproducibly) looks like undifferentiated heavy lifting, and the Spack community thought the same. Spack is an open-source community project whose mission is to simplify the process of building these complicated stacks.

The Spack Binary Cache

Today at the International Supercomputing Conference (ISC’22) in Hamburg, the Spack team released version 0.18 of the Spack package manager, containing the Spack Rolling Binary Cache. This release adds a special new capability that significantly improves the installation times of common packages.

The Binary Cache will store pre-built versions of common libraries and applications, reducing the installation time for most packages by up to 20x. This Binary Cache will be accessible to all Spack users, whether on-premises or in the cloud, and will contain builds for multiple compilers and architectures. Currently it holds 700 distinct packages, built for two operating systems and three architectures, for a bit over 5,100 package builds in total.

Spack simplifies building HPC codes by providing build recipes, dependency tracking, and provenance information. Spack makes building and subsequently managing HPC software stacks much simpler. However, building complex software stacks is still a time-consuming exercise. This problem is sometimes exacerbated in dynamic, cutting-edge environments like the cloud, because developers want to do complete rebuilds for new software releases, or even new compiler versions, often as soon as they become available.

The Binary Cache lets you install a package based upon an existing installation, rather than having to recompile it from source. As Spack stores all the provenance information already, you can be sure you’re getting a build with the correct compiler, optimization flags and every dependency. This order-of-magnitude speedup for installing common packages enables builders to get running with their codes faster than ever.
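As a rough illustration of what using the cache involves, the sketch below assembles the Spack commands that register a binary mirror, trust its signing keys, and install a package. The `spack mirror add` and `spack buildcache keys --install --trust` subcommands are part of Spack itself; the mirror name, the mirror URL, and the two helper functions are assumptions made for this example and are not taken from the text above. By default the sketch only builds and prints the commands; it runs them only when asked to and when `spack` is on the `PATH`.

```python
import shutil
import subprocess

# Assumed values for illustration only: neither the mirror name nor
# the URL is given in the text above.
MIRROR_NAME = "binary_mirror"
MIRROR_URL = "https://binaries.spack.io/releases/v0.18"


def cache_setup_commands(name=MIRROR_NAME, url=MIRROR_URL):
    """Return the Spack commands that register a mirror and trust its keys."""
    return [
        ["spack", "mirror", "add", name, url],
        ["spack", "buildcache", "keys", "--install", "--trust"],
    ]


def install_from_cache(spec, dry_run=True):
    """Build the command sequence for installing `spec`.

    The commands are executed only when dry_run is False.
    """
    commands = cache_setup_commands() + [["spack", "install", spec]]
    if not dry_run:
        if shutil.which("spack") is None:
            raise RuntimeError("spack not found on PATH")
        for cmd in commands:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)
    return commands


# Dry run: show what would be executed for a small example package.
for cmd in install_from_cache("zlib"):
    print(" ".join(cmd))
```

Once a mirror is configured, `spack install` looks for a matching binary first and falls back to a source build if none is found, so the final command needs no extra flags.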

By hosting this Binary Cache in Amazon S3, we ensure a scalable storage platform that lets the cache grow with new releases and permutations. To maximize the availability of this valuable data, we are serving the Binary Cache through Amazon CloudFront. This enables regional caching of the data, resulting in higher-bandwidth, lower-latency access. This content delivery method is ideal for serving Spack’s global user base.

Automation to the rescue

To kick off the binary cache, the Spack team have populated it with over 700 common packages. Each package has been built with multiple compilers and for three different architectures – Intel, AMD and Arm64…

Read the full blog to learn more. Reminder: You can learn a lot from AWS HPC engineers by subscribing to the HPC Tech Short YouTube channel, and following the AWS HPC Blog channel.