People to Watch 2024 – Todd Gamblin

Todd Gamblin

Distinguished Member of Technical Staff in the Livermore Computing Division at LLNL

Congratulations on your selection as a 2024 HPCwire Person to Watch. Let’s start with some history: you are known as the creator of the Spack HPC package management tool. What was the genesis of this effort? What problem were you trying to solve?

Spack was created out of frustration with the state of building software on HPC systems. HPC has always been a bleeding-edge field, and the open-source software we’ve come to know and love isn’t always well supported. Most of the time, you want to build your software yourself with a fast compiler, maybe some optimized system libraries, with some “special” vendor programming environment. The odds that the developer of the software tested with your particular system configuration are really low when there’s only one system like yours.

In 2013, I had a graduate student who had to build 10 or 12 libraries to get her project working on a Blue Gene/P system at LLNL. It was tedious and repetitive. It wasn’t like people hadn’t built these packages before — they were widely used libraries. While many HPC people knew how to hack something together manually to get the build working, there wasn’t a great way to encode all the special hacks, patches, and dependency relationships in a way that was reusable across HPC systems. People were reinventing the wheel over and over again, and Spack was made to solve that problem.

When we first released Spack as open source in 2014, it had around 170 package recipes, and that’s grown in ten years to over 7,700. As we’ve added more and more packages, we’ve realized the problem we’re really trying to solve is modeling software. Not just reproducing builds, but understanding what happens if I want to use one library instead of another — how must the versions and configurations of everything else change? We’re now modeling compiler runtime libraries, ABI compatibility, build and run environments, etc. We want Spack to be able to configure the whole software stack, down to the kernel interface. This is going to be increasingly relevant as HPC and AI hardware becomes even more diverse.

How have you found managing such a large open source project? Has management overhead become a large part of your effort?

At this point, over 1,300 people have contributed to Spack, and we get anywhere from 300 to 600 contributions each month from around 150 different people. I never expected the project to grow so much. Sometimes it’s stressful, particularly when things break, and I’ve had to learn a lot about mail filtering. But I have always wanted to build something widely used, so it’s both humbling and fulfilling to see so many people involved in Spack. The continued activity on the project is a huge motivator for me.

Management has become a large fraction of my time on Spack, and that’s good for ensuring its sustainability. We have a team of five developers (including me) at LLNL, two full-time contractors, and several staff at Kitware working on the project. We’re very lucky to have funding from DOE (ASC, ECP, LLNL, and ASCR), strong partnerships with Kitware, U. Oregon, and ParaTools, along with a strong base of contributors from too many organizations to list here. AWS has also been extremely generous in providing credits for testing; our Continuous Integration (CI) system is its own large project at this point.

My job these days is to try to balance the needs of all our stakeholders with the maintenance overhead of the project. Given the diversity of the user base (it’s almost evenly split between end users, software developers, and system administrators), it can be hard to satisfy everyone. I try hard to ensure that we’re not only advancing the list of available packaged software, but also innovating on features that empower users and enable us to maintain the project more efficiently in the long term. I still have a hand in many of the technical aspects of the project, in addition to funding and project management.

At SC23 you announced the intention to form a High Performance Software Foundation (HPSF) as part of the Linux Foundation. How has the idea of a foundation been received?

The HPSF was driven by my desire to expand the stakeholder base for HPC (and HPC-adjacent) software. I’m co-organizing HPSF with Christian Trott, who leads Kokkos, another open source HPC project that has seen a lot of adoption. The HPC community writes an enormous amount of open source, and we want to make it easier for industry, academia, and other communities to adopt it, rely on it, and contribute back. This is particularly important as computational workloads like HPC and AI become more common in industry and the cloud.

We started to see the power of a strong community software model under the U.S. Exascale Computing Project (ECP). ECP pushed DOE software projects to use better software development practices as well as to integrate, test, and distribute packages, and it generated a lot of external interest and contribution. Now, with ECP over, HPSF is a way for us to bring in an even broader set of collaborators to work on a larger body of software. We won’t have anything close to the level of funding that ECP did, but if we can bring the community together around software and help developers, I think we can have a large impact.

Others seem to agree, as HPSF has been very well received so far. When we presented it at the DOE booth at Supercomputing, we were surprised to see over a hundred people gathered for a booth presentation that wasn’t even on the official program. We’re continuing to get interest from prospective member organizations and OSS projects as we conduct our formation discussions. DOE (both NNSA and Office of Science) has also been very supportive of our efforts on HPSF.

With regard to the High Performance Software Foundation, is there anything that users, application writers, or vendors can do at this point to help?

Right now we’re working through the onboarding process for anchor members and projects, and we’re getting everyone on the same page around the project’s main funding charter. Interested developers who want to contribute their projects to HPSF and prospective member organizations should get in touch with us, as should vendors and other companies who want to join as members. See our contact information at hpsf.io. Users who want to get involved in working groups and other foundation activities should wait until we’re really rolling, after the official announcement this May.

Oh, and everyone should plan to attend our HPSF BOF session at ISC in Hamburg!

What inspired you to pursue a career in STEM, and what advice would you give to young people who wish to follow in your footsteps?

Both my parents were in STEM. My mother is a microbiologist turned high school teacher. My father was a doctor, first in the Navy, followed by private practice as an endocrinologist. They inspired me to be curious, to keep learning, and to go into some field with a positive impact on the world. I was drawn to computers because I liked getting my hands dirty and building things — obsessively hacking on programs until they worked the way I thought they should was always a lot of fun for me. I spent some time in industry and realized I liked building big systems, and I think I ended up at LLNL after graduate school because it combines all these things.

I get to work on some of the world’s fastest machines, and I feel like my work has an impact both on the broader scientific community through research and open source, and on LLNL’s national security mission by supporting our simulation codes and all the science they enable. I like the breadth of activities going on at LLNL and other national labs. I can attend a physics or math talk just as easily as I can write some code or talk about computer science. In many jobs it’s easy to get stuck in one place, but not this one.

My advice to young people getting into STEM is to find a problem you’re excited about solving and solve it for real. Build the thing that makes it not a problem for anyone, and figure out how to get other people interested in it. Don’t be bothered with what is and isn’t research, or what is or isn’t in your field; do what it takes to solve the problem. It’s likely that you’ll find even more interesting problems to solve along the way.

Outside of the professional sphere, what can you tell us about yourself – unique hobbies, favorite places, etc.? Is there anything about you your colleagues might be surprised to learn?

These days I spend most of my free time with my wife and our two daughters, ages three and six. I think I’ve acquired memberships to most of the Bay Area science museums, the Oakland Zoo, and anything else I can make a day of with the kids. I was a swimmer back in my college days, though lately I’ve gotten into CrossFit as it’s more convenient. My other major was Japanese, and I lived in Tokyo for a year as a software developer between undergrad and grad school. I’m lucky to be able to travel to Japan for work every year or so. I play Go fairly regularly. A couple of months ago I started trying to learn speedcubing, but haven’t quite broken a minute yet. I’m trying to learn all the parts of CFOP but am still on F2L.
