The story of NIH’s supercomputer Biowulf is fascinating, important, and in many ways representative of the transformation of life sciences and biomedical research into a hybrid discipline that is dependent upon advanced computational power and prolific data-generating instruments. When named in 1999 – yes this is Biowulf’s 20th birthday – it was a small cluster of 40 “boxes on shelves,” running CHARMm and BLAST, with 14 active users. A few papers (9) cited the HPC resource that year. So much for immediate impact.
Today Biowulf is a roughly 2-petaflops, general-purpose HPC resource with ~100,000 cores on diverse nodes (thin, thick, accelerated), 35 petabytes of storage, and a high-speed, InfiniBand-dominated 100-gigabit network supporting 3,000 active users. In 2018, it ran more than 34 million jobs, delivered more than a billion CPU hours, and is on pace to match that this year. In 2019, additional SSDs were deployed, allowing local scratch disk allocations of up to 2.4 TB, and the 2,500th paper citing Biowulf was published. That’s impact.
Notably, Biowulf cracked the Top500 in 2016 at #156, rose to #66 in 2017 (the last time it ran Linpack), and remains on the list. (See the HPC systems figure below; explanatory notes are at the end of the article.)
It’s been a wild ride. Unlike the big machines at national labs, which tend to sprout, enjoy a period of prominence, and then topple (decommission), Biowulf has become a living resource, evolving with the times. The latest chapter, Biowulf 2.0, concludes this summer with the completion of the most recent $70 million, five-year modernization. Today, Biowulf is the fastest supercomputer in the world solely designed for and dedicated to biomedical research. (You won’t be surprised to learn that Biowulf 3.0 planning has already begun.)
Let’s not overlook that HPC arrived at NIH reasonably early but without great fanfare or wide use. In 1986 NIH brought in a Cray X-MP/22, at the time the world’s fastest supercomputer. It had two processors that could be addressed by a single program and was used by only a few researchers, mostly at the National Cancer Institute (NCI), to study molecular structure and to do some image processing. Afterward, advanced computing infrastructure growth at NIH was irregular and somewhat modest, including Biowulf’s beginnings in 1999.
The real fireworks started in 2013/2014 when the Biowulf 2.0 project was conceived and undertaken, not long after Andrea Norris joined NIH as director of the Center for Information Technology (CIT) and CIO of NIH. “It was clear we were at the beginning of the data tsunami that was affecting biomedical research,” she recalled.
Sequencing the human genome (3.2 billion base pairs), completed circa 2001, is the watershed event most people point to when discussing biomedical research’s transformation into a digital science. HPC, writ broadly, was at least as important as the high-volume DNA sequencing machines from Applied Biosystems in accomplishing that goal. The AB sequencers sliced up the genome into small DNA fragments, amplified them, sequenced them, and read out the myriad sequenced fragments. Big computers sorted the fragments and stitched them together into the proper human genome. We’ll leave aside wrangling between public and private (Celera Genomics) efforts to finish the job. Then U.S. president Bill Clinton sort of declared the rough draft finished in 2001 and the feuding parties had little choice but to agree.
Without doubt the plummeting cost of sequencing technology and its rapid adoption jump-started the widespread use of advanced computing in life sciences, but many experimental life sciences technologies were also percolating at the same time. Chemical biology and molecular modeling (mostly scoring docking probabilities for assessing leads) had been kicking around for years and were becoming more sophisticated. An endless number of ‘omics’ – genomics, proteomics, and metabolomics are just the big three – was popping up. A variety of advanced microscopy technologies, based on improved imaging, data mining, and most recently machine learning, burst onto the scene. Systems biology, which attempts to integrate many of the new digital pieces of biology into useful simulation and prediction tools, was bubbling.
You get the idea. Lots of things were happening at once (and in IT as well). As Norris notes, there was a growing avalanche of data spilling from new instruments that only advanced computers could manage and make sense of. Before plunging into the technology choices NIH made for Biowulf 2.0, consider two examples that set the context of the times: one from the early “what’s a computer” days, and another from a “we better jump on board” moment – together they capture a compressed version of NIH’s perspective.
- Framingham Heart Study – As Vital Today as 70 Years Ago
The 70-year-old Framingham Heart Study, begun in 1948, is unique in the world and still going strong. Today it encompasses three generations of participants and is now led by Dr. Daniel Levy, senior investigator, at the National Heart, Lung, and Blood Institute (NHLBI).
“It’s quite a story of the evolution of the size and complexity of the data we have collected and the complexity of the types of analyses that have been conducted. When the study first began in the 1940s there were no computers available for the analysis of data. The data would be collected, much of it was punched into old IBM punch cards, and if you wanted to identify how many diabetics there were among the original 5000 participants you had to take those IBM punch cards and put them through a card sorter that would allow you to make a determination of the prevalence of diabetes in your study sample…To do something like find the mean value of mean blood pressure among study participants, they had an old adding machine and that was used to calculate things like means. This is a relatively primitive method,” said Levy.
Contrast that with what Levy and colleagues are now doing. “We published a paper in Nature Communications about a year ago. We looked at 71 cardiovascular disease protein biomarkers in 7000 Framingham participants and then we conducted genome wide association studies (GWAS) of each of those 71 proteins using millions of genetic variants on each individual, relating them each to each protein. A decade ago it might have been possible to do such an analysis for a single protein but daunting to think of doing it across 71 proteins. Today we are able to apply this kind of brute force analysis across 71 proteins that helped us identify genetic signals for circulating levels of these proteins and then link the genetic information on these proteins to identify proteins that may serve as causal biomarkers of cardiovascular risk.”
The Framingham Heart Study is indeed a rich resource and living instrument whose computational requirements have mushroomed just as they have throughout biomedical research and more recently in the clinic. Today, among other things, the Framingham Study is readying to tackle analysis of whole genome sequencing for thousands of individuals and those datasets will be immense.
- Where Have All the Applicants Gone
So while the requirements of DNA sequencing and genomics research broadly comprised the point of the spear of biology’s thrust into computation, to some extent they also acted like the canary in a coal mine, revealing NIH’s declining attraction for top talent.
“Rolling the clock back to 2013, we came to a sober realization that we weren’t keeping up with the needs of the intramural program in the HPC space,” explained Dr. Andy Baxevanis, director of computational biology for the NIH Intramural Research Program and senior scientist at the National Human Genome Research Institute (NHGRI). “We did not have nearly enough horsepower to meet the demand of all of our investigators and we were starting to see the effects of that. Not acting would have had a devastating effect on our research program, and we were already seeing how we were starting to fall behind our peer institutions (government and academic) in that there was a measurable adverse effect on our ability to recruit and retain the best and brightest.”
NIH responded with the Biowulf 2.0 project, a five-phase project, with phases loosely corresponding to the years of implementation. Here’s Andrea Norris on the effort:
“Our objectives over the five-year period were to put in a modern architecture that had both the power and flexibility to meet the needs of intramural researchers across our 27 institutes and centers and a wide variety of different disease and health domains and across basic, translational and clinical research. [We wanted] to promote data sharing and scientific collaboration by having this resource centrally located on campus with 100 gigabit network connectivity to and from the labs and out through the internet. [We also wanted to provide] common application support, and now support a suite of more than 600 commonly-used applications and tools shared by all of our researchers and provide ample high availability storage.”
The Biowulf 2.0 project, launched in 2014, was intended to put NIH’s computational capabilities on equal footing (or better) with its peer institutions. Then as now, emphasizes Norris, science requirements drove HPC design and architecture choices.
“As a CIO and as a service provider, I keep an eye on the fringe technologies but what we [deploy] has got to be something that there’s demand for, that’s practical, and can be supported and sustained. But we are, for example, in the Exascale Computing Initiative and our role has been in giving them the requirements for the kind of research we would love to be able to do that we cannot,” said Norris. Following an extensive needs assessment by consulting firm BioTeam, NIH worked with a systems integrator (initially Computer Sciences Corp., which, through M&As, is now General Dynamics Information Technology (GDIT)) to deploy Biowulf 2.0 over the next few years.
Moving Life (Sciences) into the Fast Lane
Interestingly the first item of business was improving the network. Data movement was a painful pinch point as more research groups brought in new lab equipment and generated more data. In conjunction with the Biowulf upgrade, NIH undertook an extensive network modernization, which Norris calls the linchpin enabling Biowulf.
The first step for Biowulf 2.0 was to upgrade the existing Ethernet network, then 1-to-10 Gbps, to 40 Gbps. In 2016 the decision was made to move to InfiniBand to reach 100 Gbps.
“That necessitated getting gateways installed between the Ethernet and IB fabrics and that actually turned out to be a little more challenging than we first thought,” said Steve Fellini, lead technologist, high performance computing (HPC) at NIH Center for Information Technology. “Since phase two we have been expanding the IB fabric. With Phase five we will have reached capacity on the current IB fabric and we’ll be building out an aggregation layer above the current core switches. That’ll be an HDR-based fabric.”
Not surprisingly, the last mile is an issue for NIH (as it is for others). “While NIH funded fast connections to particular buildings, after that it is the responsibility of the individual institute to build out the network to the actual lab, and that has been relatively slow going. So some of our users have better connectivity than others. We very much encourage people to use Globus, with which we have had good luck,” said Fellini.
Norris noted, “We now have a 100 gig, very large, distributed, state of the art network to 100-plus labs and facilities here on campus and near campus. At the start of the project you couldn’t even track how data are moving through it. Now, we are moving about 6 petabytes of data a day and watching that increase each year. While we have an incredibly powerful NIH network, we still do struggle a bit with that last mile, so up to the workstation or the piece of scientific equipment that’s literally sitting in the lab.”
With Data Intensive Science Comes Lots of Data
Unsurprisingly, adding storage capacity was critical. A single cryo-EM microscope, for example, can generate 5 TB of data a day, and scientists are well-known for bringing in new instruments without sufficient regard for the needed IT support. Indeed, it’s a common, perhaps unavoidable refrain that IT refreshes can’t keep pace with life sciences instrument refreshes. In any case, the original Biowulf 2.0 goal was to reach 14 petabytes of storage; at 35 PB, it has far exceeded that goal.
“We used to be surprised when we got requests for 100 GB of additional storage; now typically we’ll get requests for 10-20 TB. That’s not unusual and certainly guided our decision making as we were figuring out what we do next,” said Steve Bailey, chief, high performance computing (HPC) at the NIH Center for Information Technology.
Fellini added, “Each compute node has an SSD and we’ll often ask users to move as much I/O as possible to that scratch storage in order not to overload our network-based shared storage. Data to be retained can then be copied to shared space at job completion.”
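Fellini’s staging pattern can be sketched as a Slurm batch script (Biowulf schedules jobs with Slurm). This is a minimal, hypothetical sketch: the tool name, file paths, and resource sizes are illustrative, and the `--gres=lscratch` request and `/lscratch/$SLURM_JOB_ID` path should be checked against the current Biowulf user guide.

```shell
#!/bin/bash
#SBATCH --job-name=stage-demo
#SBATCH --cpus-per-task=8
#SBATCH --mem=32g
#SBATCH --gres=lscratch:100        # request 100 GB of node-local SSD scratch

# Work in the per-job scratch directory on the node's local SSD,
# keeping heavy I/O off the shared network filesystems.
cd /lscratch/$SLURM_JOB_ID

# Stage input from shared storage to local scratch (paths are illustrative).
cp /data/$USER/input.bam .

# Run the analysis against local scratch ("analysis_tool" is a placeholder).
analysis_tool --threads $SLURM_CPUS_PER_TASK input.bam -o results.vcf

# At job completion, copy only results worth keeping back to shared space;
# local scratch is reclaimed when the job ends.
cp results.vcf /data/$USER/results/
```

A script like this would be submitted from the login node with `sbatch stage-demo.sh`; since it is a job-script fragment, it only runs under the cluster’s scheduler.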
“We made the decision to go with GPFS over Lustre some number of years ago. There are no thoughts of switching to Lustre,” said Susan Chacko, lead scientist, high performance computing (HPC) at the NIH Center for Information Technology. “There are some questions about DDN and its relationship with IBM GPFS. We are tracking what changes there might be to licensing for GPFS.”
Looking ahead she said, “We are very much interested in solid state technology and have in house a small Vast Data cluster for storing data, so we’re evaluating that. Actually, we’re benchmarking the Vast cluster as we speak. Soon we’ll be evaluating a DDN SFA18K as well.”
Norris takes a long-term view: “Data storage is a challenge for us, given the vast amounts of data that we’re using. NIH-wide archival and long-term storage strategies and approaches are much needed. This is an area we are going to devote more attention to in Biowulf 3.0.”
Building the Core Compute in Every Way but Exotic
Expanding core compute capacity throughout Biowulf’s history has been an incremental process driven by pressing science needs, with more nodes added yearly. In 2002, for example, 198 nodes were added, including 24 with 24 GB of memory. CPUs were all x86-based (Intel and AMD). Sixteen pilot GPU nodes were added in 2010, by which time the total Biowulf core count was up to 9,000. NIH provides an excellent Biowulf history timeline online that is fun to ramble through, with click-throughs to points of interest.
Most of the big changes occurred during the Biowulf 2.0 project. Thirty thousand cores were added in 2015. Another thirty thousand were added in 2016, including a batch of K80 GPU nodes and support for the HPC container technology Singularity. In 2017, 48 Nvidia P100 GPU nodes, each with four P100s, were added. Eight V100 nodes, again with four GPUs each, were added in 2018.
The CIT HPC team does review emerging technologies but, again, tends to focus on what is readily available and proven.
“In fact for phase 5, while we’ve been using Intel based chips for the last four or five years, this year we took a look at the AMD EPYC chip. While we were impressed with its performance, for various reasons we couldn’t get the packaging the way we needed it. So we expect in the next year or so it will be a viable alternative to Intel,” said Bailey.
IBM’s Power chip line is not seen as a likely option at this time, according to Bailey: “Susan has a group of scientists to support with over 600 apps, and having a mixed architecture, say between (IBM) Power and Intel, would not be a very viable solution at this point, just by the sheer number of applications that would need to be recompiled.”
“We have finished most of our interviews and requirements gathering and benchmarks. Now we are starting the analysis to sort through what the recommendations are going to be.”
Training – If You Build It, Will They Come?
Building a powerful HPC resource is one thing. Helping biomedical researchers, many of whom have limited computational expertise or training, make effective use of it is another. While IT expertise among researchers is changing, it remains a mixed bag:
- Still plenty of computer novices…Baxevanis noted many researchers, “don’t code and have never taken a computer science course, but they know that this is a resource they should be using to advance their research projects. To close this knowledge gap, the [CIT HPC team] had been offering in-person training sessions, but the classes were selling out so quickly they couldn’t keep up with demand, so the Biowulf team developed an online Introduction to Biowulf series, allowing more people to quickly come up-to-speed on using the HPC resources available to them. In addition, a very cool thing that the Biowulf team does is offer ‘coffee shop consults’ that are sprinkled around the Bethesda campus, where [CIT HPC team members] just hang out with our scientists who come with their questions and start banging out solutions right there on their laptops.”
- …But the number of HPC savvy ones is up. Levy added, “I can tell you that over the course of the last 8 years or so there’s been a dramatic evolution in the kinds of research I am doing, the kinds of researchers I am hiring as post-doctoral fellows and staff scientists. Many of the researchers on my team now are computational biologists, bioinformatics experts, systems biology researchers and we are dependent upon the computing resources.”
Chacko offers a balanced view: “I think [the situation] has changed significantly over the last 15 years. We used to think that workshops every two months were enough. 15 years ago, we would get a small number of students in the class who were actually familiar with Linux. That has changed dramatically. Now, a good number of people have some level of familiarity with Linux. The systems have gotten more complicated, and the kinds of jobs they want to run are often on a much larger scale than they were familiar with, so I think there is still a lot of hand-holding required, but it is at a slightly different level.”
Added Bailey, “In fact our scientists spend at least half of their time helping users to debug their jobs. We don’t do any collaborative research but when we see users who are having trouble we have staff that look at the way they are structuring their jobs, give them advice about how to set up a pipeline, and how best to optimize it for the system.”
Norris recognizes the challenge and said, “Biowulf 2.0 was really focused on traditional HPC capabilities and in submitting applications from the command line to a queue. We really have to broaden our services in support in this next phase for the less computationally sophisticated scientists.” Still, the current numbers aren’t bad. Roughly half of NIH researchers are making use of Biowulf – that’s well beyond the original 25 percent forecast given by BioTeam.
Introducing Biowulf 3.0…
It’s worth noting how biomedical workloads and the computational requirements have changed. Early genomics applications – sequence assembly, alignment, and variant calling – were more about embarrassingly parallel data processing than the traditional tightly-coupled computation of HPC modeling and simulation. Molecular modeling and systems biology used a mix of both. The rise of imaging (microscopy is just one example) and the need for accurate identification of images (think pathology reports) has proven ideal for machine learning.
Today there is a diversity of workloads which benefit from a variety of computational strengths. This is noted in NIH’s official description: “Biowulf is designed for large numbers of simultaneous jobs common in the biosciences, as well as large-scale distributed memory tasks such as molecular dynamics. A wide variety of scientific software is installed and maintained on Biowulf, along with scientific databases.”
Biomedical research computing is only growing more complex. A good example was the opening keynote The Algorithms of Life – Scientific Computing for Systems Biology, presented by Ivo Sbalzarini (see HPCwire coverage of Sbalzarini’s talk).
“What will Biowulf 3.0 look like?” asked Baxevanis rhetorically. “Right now, the machine is a general purpose computing resource. We could certainly just make it bigger and people would be happy with that, but in the long run, that’s not the right way to go – it has to be both bigger and different at the same time. We are in the middle of a long-term strategic planning process, and part of that process involves evaluating new architectures and new technologies so that we can continue to meet the scientific needs of the intramural research program (IRP).
“We’re particularly focused on how the architecture should be structured so we can start doing much more in the realms of deep learning and artificial intelligence. Some of our most recent PI recruitments have brought in talented people in this field, mostly in the National Cancer Institute’s Center for Cancer Research, and we’re actively laying the groundwork to be able to have significant presence in this area. It’s something that we are admittedly new to so we are tiptoeing in, but it’s where we see the future of biomedical computing.”
The process will be similar to Biowulf 2.0’s: lay out a compelling argument, plan, and proposed budget, and convince NIH leadership to fund the effort. It is also good to remember that Biowulf, though perhaps preeminent, is one of many NIH computational initiatives. Data-intensive science rules them all, and last year, for the first time, NIH laid out its data science strategy.
Said Norris, “With Biowulf 2.0, we really built out our capability incrementally. Year by year, we added and replaced old boards and added capability. Each year we did a big upgrade, if you will. That may not be the approach we take with Biowulf 3.0. We may do a more consolidated modernization every two or three years as opposed to small, incremental, year-by-year upgrades.” BioTeam is again doing the assessment.
It will be fun to watch.
Notes for Biowulf/HPC systems diagram
The Biowulf cluster is a 95,000+ core/30+ PB Linux cluster. Biowulf is designed for large numbers of simultaneous jobs common in the biosciences, as well as large-scale distributed memory tasks such as molecular dynamics. A wide variety of scientific software is installed and maintained on Biowulf, along with scientific databases. See our hardware page for more details. Any scientific computation should be run on cluster compute nodes as batch jobs or sinteractive sessions. Compute nodes can access http and ftp sites outside our network via a proxy so that some data transfer jobs can be run on the cluster.
The login node (biowulf.nih.gov) is used to submit jobs to the cluster. Users connect to this system via ssh or NX. No compute intensive, data transfer or large file manipulation processes should be run on the login node. This system is for submitting jobs only.
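The login node’s submit-only role translates into a short session like the following sketch; `ssh`, `sbatch`, and `sinteractive` are as described above, while the username, script name, and resource flags are illustrative.

```shell
# Connect to the login node -- job submission only, no heavy computation.
ssh user@biowulf.nih.gov

# Submit a batch script to the cluster from the login node.
sbatch myjob.sh

# Or request an interactive session, which lands you on a compute node
# where real work is allowed (flags shown are illustrative).
sinteractive --cpus-per-task=4 --mem=16g
```

These commands assume an existing HPC account; consult the Biowulf user guide for the current flags and defaults.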
Helix (helix.nih.gov) is the interactive data transfer and file management node for the NIH HPC Systems. Users should run all such processes (scp, sftp, Aspera transfers, rsync, wget/curl, large file compressions, etc.) on this system. Scientific applications are not available on Helix. Helix is a 48 core (4 X 3.00 GHz 12-core Xeon™ Gold 6136) system with 1.5 TB of main memory running RedHat Enterprise Linux 7 and has a direct connection to the internet.
The helixdrive service allows users on the NIH network to mount their home, data, and shared directories as mapped network drives on their local workstations.
Sciware is a ‘software on demand’ service that provides scientific software that runs on Windows, Mac and Linux desktops. Sciware is available to anyone with an HPC account. Software includes Matlab and Mathematica.
Helixweb is a set of web-based scientific tools.
Globus is a file transfer service that makes it easy to move, sync and share large amounts of data within the NIH as well as with other sites.
The http and ftp proxies allow users to fetch data from the internet on compute nodes with tools like wget, curl, and ftp.
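In practice, a job on a compute node reaches external sites through those proxies via the standard proxy environment variables, which tools like wget and curl honor automatically. A sketch, assuming the variables are pre-set in the job environment (the URLs are placeholders; check the Biowulf user guide for the actual configuration):

```shell
# Inspect the proxy settings in the job environment (if configured).
echo "http_proxy=$http_proxy  ftp_proxy=$ftp_proxy"

# wget and curl read these variables automatically, so ordinary
# download commands work from compute nodes without direct internet access.
wget -q https://example.org/reference/genome.fa.gz
curl -s -O ftp://example.org/pub/annotations.gtf.gz
```

This fragment depends on the cluster’s proxy configuration and outbound network, so it is a sketch rather than something runnable off-cluster.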