Rise of NIH’s Biowulf Mirrors the Rise of Computational Biology

By John Russell

July 29, 2019

The story of NIH’s supercomputer Biowulf is fascinating, important, and in many ways representative of the transformation of life sciences and biomedical research into a hybrid discipline that is dependent upon advanced computational power and prolific data-generating instruments. When named in 1999 – yes this is Biowulf’s 20th birthday – it was a small cluster of 40 “boxes on shelves,” running CHARMm and BLAST, with 14 active users. A few papers (9) cited the HPC resource that year. So much for immediate impact.

Today Biowulf is a roughly 2 petaflops, general purpose HPC resource, with ~100,000 cores on diverse nodes (thin, thick, accelerated), 35 petabytes of storage, a high speed, InfiniBand-dominated 100 gig network, supporting 3000 active users. In 2018, it ran more than 34 million jobs and delivered more than a billion CPU hours and is on pace to match that this year. In 2019 additional SSDs have been deployed, allowing up to 2.4 TB of local scratch disk allocations, and the 2500th paper citing Biowulf was published. That’s impact.

Notably, Biowulf cracked the Top500 in 2016 at #156 and rose to #66 in 2017 (the last time it ran Linpack) and remains on the list. (See HPC systems figure below; explanatory notes are at the end of the article.)

It’s been a wild ride. Unlike the big machines at national labs which tend to sprout, enjoy a period of prominence, and then topple (decommission), Biowulf has become a living resource, evolving with the times. The latest chapter, Biowulf 2.0, comes to an end this summer in the sense of completing the most recent $70 million, five-year modernization. Today, Biowulf is the fastest supercomputer in the world that is solely designed for and dedicated to biomedical research. (You won’t be surprised to learn that Biowulf 3.0 planning has already begun.)

Let’s not overlook that HPC arrived at NIH reasonably early but without great fanfare or wide use. In 1986 NIH brought in a Cray X-MP/22, which was at the time the world’s fastest supercomputer. It had two processors that could be addressed by a single program and was used by a very few researchers, mostly at the National Cancer Institute (NCI), to study molecular structure and to do some image processing. Afterward, advanced computer infrastructure growth at NIH was irregular and somewhat modest including Biowulf’s beginnings in 1999.

Biowulf Servers

The real fireworks started in 2013/2014 when the Biowulf 2.0 project was conceived and undertaken, not long after Andrea Norris joined NIH as director of the Center for Information Technology and CIO at NIH. “It was clear we were at the beginning of the data tsunami that was affecting biomedical research,” she recalled.

Sequencing the human genome (3.2 billion base pairs), completed circa 2001, is the watershed event most people point to when discussing biomedical research’s transformation into a digital science. HPC, writ broadly, was at least as important as the high-volume DNA sequencing machines from Applied Biosystems in accomplishing that goal. The AB sequencers sliced up the genome into small DNA fragments, amplified them, sequenced them, and read out the myriad sequenced fragments. Big computers sorted the fragments and stitched them together into the proper human genome. We’ll leave aside wrangling between public and private (Celera Genomics) efforts to finish the job. Then-U.S. president Bill Clinton sort of declared the rough draft finished in 2000 and the feuding parties had little choice but to agree.

Without doubt the plummeting cost of sequencing technology and its rapid adoption jump-started the widespread use of advanced computing in life sciences, but many experimental life sciences technologies were also percolating at the same time. Chemical biology and molecular modeling (mostly scoring docking probabilities for assessing leads) had been kicking around for years and were becoming more sophisticated. An endless number of ‘omics’ – genomics, proteomics, and metabolomics are just the big three – were popping up. A variety of advanced microscopy technologies based on improved imaging, data mining, and most recently machine learning burst onto the scene. Systems biology, which attempts to integrate many of the new digital pieces of biology into useful simulation and prediction tools, was bubbling.

You get the idea. Lots of things were happening at once (and in IT as well). As Norris notes, there was a growing avalanche of data spilling from new instruments that only advanced computers could manage and make sense of. Before plunging into the technology choices NIH made for Biowulf 2.0, consider two examples that set the context of the times: one from the early “what’s a computer” days and another from a “we better jump on board” moment. Together they capture a compressed version of NIH’s perspective.

  1. Framingham Heart Study – As Vital Today as 70 Years Ago

The 70-year-old Framingham Heart Study, begun in 1948, is unique in the world and still going strong. Today it encompasses three generations of participants and is now led by Dr. Daniel Levy, senior investigator, at the National Heart, Lung, and Blood Institute (NHLBI).

Daniel Levy, NHLBI

“It’s quite a story of the evolution of the size and complexity of the data we have collected and the complexity of the types of analyses that have been conducted. When the study first began in the 1940s there were no computers available for the analysis of data. The data would be collected, much of it was punched into old IBM punch cards, and if you wanted to identify how many diabetics there were among the original 5000 participants you had to take those IBM punch cards and put them through a card sorter that would allow you to make a determination of the prevalence of diabetes in your study sample…To do something like find the mean value of mean blood pressure among study participants, they had an old adding machine and that was used to calculate things like means. This is a relatively primitive method,” said Levy.

Contrast that with what Levy and colleagues are now doing. “We published a paper in Nature Communications about a year ago. We looked at 71 cardiovascular disease protein biomarkers in 7000 Framingham participants and then we conducted genome wide association studies (GWAS) of each of those 71 proteins using millions of genetic variants on each individual, relating them each to each protein. A decade ago it might have been possible to do such an analysis for a single protein but daunting to think of doing it across 71 proteins. Today we are able to apply this kind of brute force analysis across 71 proteins that helped us identify genetic signals for circulating levels of these proteins and then link the genetic information on these proteins to identify proteins that may serve as causal biomarkers of cardiovascular risk.”

The Framingham Heart Study is indeed a rich resource and living instrument whose computational requirements have mushroomed just as they have throughout biomedical research and more recently in the clinic. Today, among other things, the Framingham Study is readying to tackle analysis of whole genome sequencing for thousands of individuals and those datasets will be immense.

  2. Where Have All the Applicants Gone

So while requirements for DNA sequencing and genomics research broadly comprised the point of the spear of biology’s thrust into computation, to some extent they also acted a bit like the canary in a coal mine for NIH with regard to identifying NIH’s declining attraction for top talent.

“Rolling the clock back to 2013, we came to a sober realization that we weren’t keeping up with the needs of the intramural program in the HPC space,” explained Dr. Andy Baxevanis, director of computational biology for the NIH Intramural Research Program and senior scientist at the National Human Genome Research Institute (NHGRI). “We did not have nearly enough horsepower to meet the demand of all of our investigators and we were starting to see the effects of that. Not acting would have had a devastating effect on our research program, and we were already seeing how we were starting to fall behind our peer institutions (government and academic) in that there was a measurable adverse effect on our ability to recruit and retain the best and brightest.”

NIH responded with the Biowulf 2.0 project, a five-phase project, with phases loosely corresponding to the years of implementation. Here’s Andrea Norris on the effort:

Andrea Norris, NIH CIO

“Our objectives over the five-year period were to put in a modern architecture that had both the power and flexibility to meet the needs of intramural researchers across our 27 institutes and centers and a wide variety of different disease and health domains and across basic, translational and clinical research. [We wanted] to promote data sharing and scientific collaboration by having this resource centrally located on campus with 100 gigabit network connectivity to and from the labs and out through the internet. [We also wanted to provide] common application support, and now support a suite of more than 600 commonly-used applications and tools shared by all of our researchers and provide ample high availability storage.”

The Biowulf 2.0 project, launched in 2014, was intended to put NIH’s computational capabilities on equal footing (or better) with its peer institutions. Then as now, emphasizes Norris, science requirements drove HPC design and architecture choices.

“As a CIO and as a service provider, I keep an eye on the fringe technologies but what we [deploy] has got to be something that there’s demand for, that’s practical, and can be supported and sustained. But we are, for example, in the Exascale Computing Initiative and our role has been in giving them the requirements for the kind of research we would love to be able to do that we cannot,” said Norris. Following an extensive needs assessment by consulting firm BioTeam, NIH worked with a systems integrator (initially Computer Sciences Corp., which, through mergers and acquisitions, is now General Dynamics Information Technology (GDIT)) to deploy Biowulf 2.0 over the next few years.

Moving Life (Sciences) into the Fast Lane

Interestingly the first item of business was improving the network. Data movement was a painful pinch point as more research groups brought in new lab equipment and generated more data. In conjunction with the Biowulf upgrade, NIH undertook an extensive network modernization, which Norris calls the linchpin enabling Biowulf.

The first step for Biowulf 2.0 was to upgrade the existing Ethernet network, then 1-to-10 gig, to 40 gig. In 2016 the decision was made to go to InfiniBand to get to 100 Gbps.

“That necessitated getting gateways installed between the Ethernet and IB fabrics and that actually turned out to be a little more challenging than we first thought,” said Steve Fellini, lead technologist, high performance computing (HPC) at NIH Center for Information Technology. “Since phase two we have been expanding the IB fabric. With Phase five we will have reached capacity on the current IB fabric and we’ll be building out an aggregation layer above the current core switches. That’ll be an HDR-based fabric.”

Not surprisingly, the last mile is an issue for NIH (as it is for others). “While NIH funded connections, fast connection to particular buildings, after that it is the responsibility of the individual institute to build out the network to the actual lab and that has been relatively slow going. So some of our users have better connectivity than others. We very much encourage people to use Globus with which we have had good luck,” said Fellini.

Norris noted, “We now have a 100 gig, very large, distributed, state of the art network to 100-plus labs and facilities here on campus and near campus. At the start of the project you couldn’t even track how data are moving through it. Now, we are moving about 6 petabytes of data a day and watching that increase each year. While we have an incredibly powerful NIH network, we still do struggle a bit with that last mile, so up to the workstation or the piece of scientific equipment that’s literally sitting in the lab.”
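To put the 6 petabytes a day in perspective, the figure can be converted into an average data rate. The following is a back-of-the-envelope sketch, not an NIH measurement: it assumes decimal units (1 PB = 10^15 bytes) and traffic spread evenly over 24 hours, and the function name is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_gbps(petabytes_per_day: float) -> float:
    """Average rate in gigabits per second, assuming 1 PB = 1e15 bytes."""
    bits_per_day = petabytes_per_day * 1e15 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e9


rate = average_gbps(6)
print(f"{rate:.0f} Gbit/s on average")  # prints "556 Gbit/s on average"
```

Under those assumptions the campus-wide total is equivalent to between five and six 100 gig links running flat out around the clock, which fits Norris’s description of a large, distributed network rather than a single pipe.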

With Data Intensive Science Comes Lots of Data

Unsurprisingly, adding storage capacity was critical. A single cryo-EM microscope, for example, can generate 5 TB of data a day, and scientists are well-known for bringing in new instruments without sufficient regard for needed IT support. Indeed it’s a common, perhaps unavoidable refrain that IT refreshes can’t keep pace with life sciences instrument refreshes. In any case, the original Biowulf 2.0 goal was to get to 14 petabytes of storage; it has far exceeded that goal (35 PB).
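A rough calculation shows why the network upgrade and the storage build-out went hand in hand. The sketch below estimates how long one day of cryo-EM output (5 TB) would take to move at the link speeds mentioned in the network discussion (1, 10, 40 and 100 gigabits per second). It assumes an ideal, fully available link with no protocol overhead and decimal units, so real transfers would be slower; the function name is illustrative.

```python
def transfer_hours(terabytes: float, link_gbps: float) -> float:
    """Hours to move `terabytes` over an ideal link of `link_gbps` gigabits/s."""
    bits = terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600


for gbps in (1, 10, 40, 100):
    print(f"{gbps:>3} Gbit/s: {transfer_hours(5, gbps):.2f} h")
```

At 1 gig a single instrument’s daily output ties up the link for roughly 11 hours; at 100 gig the same data moves in under seven minutes, at least in this idealized model.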

Steve Bailey, chief, HPC, NIH Center for Information Technology

“We used to be surprised when we got requests for 100 GB of additional storage; now typically we’ll get requests for 10-20 TB. That’s not unusual and certainly guided our decision making as we were figuring out what we do next,” said Steve Bailey, chief, high performance computing (HPC) at the NIH Center for Information Technology.

Fellini added, “Each compute node has an SSD and we’ll often ask users to move as much I/O as possible to that scratch storage in order not to overload our network-based shared storage. Data to be retained can then be copied to shared space at job completion.”
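The pattern Fellini describes (do the heavy I/O on the node’s local SSD, then copy only the results worth keeping to shared storage) can be sketched in a few lines. This is a minimal illustration, not Biowulf’s actual tooling: `run_with_local_scratch`, `count_lines` and the `scratch_root` parameter are hypothetical names, and the scratch location defaults to the system temporary directory rather than any real Biowulf path.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_local_scratch(input_file, shared_out_dir, work, scratch_root=None):
    """Stage input on local scratch, run `work` there, copy the result back."""
    input_file = Path(input_file)
    shared_out_dir = Path(shared_out_dir)
    shared_out_dir.mkdir(parents=True, exist_ok=True)

    # The temporary directory (and everything in it) is removed on exit.
    with tempfile.TemporaryDirectory(dir=scratch_root) as scratch:
        scratch = Path(scratch)
        local_input = scratch / input_file.name
        shutil.copy2(input_file, local_input)   # single read from shared storage

        result = work(local_input, scratch)      # intermediate I/O stays local

        final = shared_out_dir / result.name
        shutil.copy2(result, final)              # single write at job completion
    return final


def count_lines(local_input, scratch):
    """Toy workload: write the input's line count to a file in scratch."""
    out = scratch / "line_count.txt"
    with open(local_input) as src:
        n = sum(1 for _ in src)
    out.write_text(f"{n}\n")
    return out
```

A job would call something like `run_with_local_scratch("reads.txt", "results", count_lines, scratch_root=...)`, pointing `scratch_root` at whatever node-local directory the scheduler provides. The shared file system then sees one read and one write per job instead of every intermediate operation.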

DDN and NetApp have been primary storage technology suppliers. GPFS (now IBM Spectrum Scale) has been the parallel file system of choice.

“We made the decision to go with GPFS over Lustre some number of years ago. There are no thoughts of switching to Lustre,” said Susan Chacko, lead scientist, high performance computing (HPC) at the NIH Center for Information Technology. “There are some questions about DDN and the relationship with IBM GPFS. We are tracking what changes there might be to licensing for GPFS.”

Looking ahead she said, “We are very much interested in solid state technology and have in house a small Vast Data cluster for storage data and so we’re evaluating that. Actually, we’re benchmarking the Vast cluster as we speak. Soon we’ll be evaluating a DDN SFA18K as well.”

Norris takes a long-term view: “Data storage is a challenge for us, given the vast amounts of data that we’re using. NIH-wide archival and long-term storage strategies and approaches are much needed. This is an area we are going to spend more attention in this next Biowulf 3.0.”

Building the Core Compute in Every Way but Exotic

Expanding the core compute capacity throughout Biowulf’s history was an incremental process driven by pressing science needs with more nodes added yearly. In 2002, for example, 198 nodes including 24 nodes with 24 GB of memory were added. CPUs were all x86-based (Intel and AMD). Sixteen pilot GPU nodes were added in 2010, by which time the total Biowulf core count was up to 9000. NIH provides an excellent Biowulf history timeline online that is fun to ramble through with click-throughs to points of interest.

Most of the big changes occurred during the Biowulf 2.0 project. Thirty thousand cores were added in 2015. Another thirty thousand cores were added in 2016, including a number of K80 GPU nodes, along with support for the HPC container technology Singularity. In 2017, 48 Nvidia P100 GPU nodes, each with four P100s, were added. Eight V100 nodes, again with four GPUs each, were added in 2018.

The CIT HPC team does review emerging technologies, but again, tends to focus on what is readily available and proven.

“In fact for phase 5, while we’ve been using Intel based chips for the last four or five years, this year we took a look at the AMD EPYC chip. While we were impressed with its performance, for various reasons we couldn’t get the packaging the way we needed it. So we expect in the next year or so it will be a viable alternative to Intel,” said Bailey.

IBM’s Power chip line is not seen as a likely option at this time, according to Bailey: “Susan has a group of scientists to support with over 600 apps and having a mixed architecture say between (IBM) Power and Intel would not be a very viable solution at this point just by the sheer number of applications that would need to be recompiled.”

“We have finished most of our interviews and requirements gathering and benchmarks. Now we are starting to do the analysis to sort through what are going to be the recommendations.”

Training – If You Build It, Will They Come?

Building a powerful HPC resource is one thing. Helping biomedical researchers, many of whom have limited computational expertise or training, make effective use of the resource is another. While IT expertise levels among researchers are changing, it remains a mixed bag:

  • Still plenty of computer novices…Baxevanis noted many researchers, “don’t code and have never taken a computer science course, but they know that this is a resource they should be using to advance their research projects. To close this knowledge gap, the [CIT HPC team] had been offering in-person training sessions, but the classes were selling out so quickly they couldn’t keep up with demand, so the Biowulf team developed an online Introduction to Biowulf series, allowing more people to quickly come up-to-speed on using the HPC resources available to them. In addition, a very cool thing that the Biowulf team does is offer ‘coffee shop consults’ that are sprinkled around the Bethesda campus, where [CIT HPC team members] just hang out with our scientists who come with their questions and start banging out solutions right there on their laptops.”
  • …But the number of HPC savvy ones is up. Levy added, “I can tell you that over the course of the last 8 years or so there’s been a dramatic evolution in the kinds of research I am doing, the kinds of researchers I am hiring as post-doctoral fellows and staff scientists. Many of the researchers on my team now are computational biologists, bioinformatics experts, systems biology researchers and we are dependent upon the computing resources.”

Chacko offers a balanced view: “I think [the situation] has changed significantly over the last 15 years. We used to think that workshops every two months was enough. 15 years ago, we would get a small number of students in the class who were actually familiar with Linux. That has changed dramatically. Now, a good number of people have some level of familiarity with Linux. The systems have got more complicated and the kinds of jobs they want to run often are on a much larger scale than they were familiar with so I think there is still a lot of hand holding required but it is at a slightly different level.”

Added Bailey, “In fact our scientists spend at least half of their time helping users to debug their jobs. We don’t do any collaborative research but when we see users who are having trouble we have staff that look at the way they are structuring their jobs, give them advice about how to set up a pipeline, and how best to optimize it for the system.”

Norris recognizes the challenge and said, “Biowulf 2.0 was really focused on traditional HPC capabilities and in submitting applications from the command line to a queue. We really have to broaden our services and support in this next phase for the less computationally sophisticated scientists.” Still, the current numbers aren’t bad. Roughly half of NIH researchers are making use of Biowulf – that’s well beyond the original 25 percent forecast given by BioTeam.

Introducing Biowulf 3.0….

It’s worth noting how biomedical workloads and the computational requirements have changed. Early genomics applications – sequence assembly, alignment, and variant calling – were more about embarrassingly parallel data processing than the traditional tightly-coupled computation of HPC modeling and simulation. Molecular modeling and systems biology used a mix of both. The rise of imaging (microscopy is just one example) and the need for accurate identification of images (think pathology reports) has proven ideal for machine learning.

Today there is a diversity of workloads which benefit from a variety of computational strengths. This is noted in NIH’s official description: “Biowulf is designed for large numbers of simultaneous jobs common in the biosciences, as well as large-scale distributed memory tasks such as molecular dynamics. A wide variety of scientific software is installed and maintained on Biowulf, along with scientific databases.”

Biomedical research computing is growing only more complex. A good example of this was the opening keynote, The Algorithms of Life – Scientific Computing for Systems Biology, presented by Ivo Sbalzarini (see HPCwire coverage of Sbalzarini’s talk).

Andy Baxevanis, director of computational biology for the NIH Intramural Research Program and senior scientist at the National Human Genome Research Institute (NHGRI)

“What will Biowulf 3.0 look like?” asked Baxevanis rhetorically. “Right now, the machine is a general purpose computing resource. We could certainly just make it bigger and people would be happy with that, but in the long run, that’s not the right way to go – it has to be both bigger and different at the same time. We are in the middle of a long-term strategic planning process, and part of that process involves evaluating new architectures and new technologies so that we can continue to meet the scientific needs of the intramural research program (IRP).

“We’re particularly focused on how the architecture should be structured so we can start doing much more in the realms of deep learning and artificial intelligence. Some of our most recent PI recruitments have brought in talented people in this field, mostly in the National Cancer Institute’s Center for Cancer Research, and we’re actively laying the groundwork to be able to have significant presence in this area. It’s something that we are admittedly new to so we are tiptoeing in, but it’s where we see the future of biomedical computing.”

The process will be similar to Biowulf 2.0, which is to lay out a compelling argument, plan, and proposed budget and to convince NIH leadership to fund the effort. It is also good to remember that Biowulf, though perhaps preeminent, is one of many NIH computational initiatives. Data intensive science rules them all and for the first time last year NIH laid out its data science strategy.

Said Norris, “With Biowulf 2.0, we really built out our capability incrementally. Year by year, we added and replaced old boards and added capability. Each year we did a big upgrade if you will. That may likely not be the approach to try to take in Biowulf 3.0. We may do a more consolidated every 2- or 3-year modernization as opposed to small incremental year by year.” BioTeam is again doing the assessment.

It will be fun to watch.

 

Links to HPCwire articles on the state of HPC in life sciences in 2019:

1) HPC in Life Sciences Part 1: CPU Choices, Rise of Data Lakes, Networking Challenges, and More

2) HPC in Life Sciences Part 2: Penetrating AI’s Hype and the Cloud’s Haze

Notes for Biowulf/HPC systems diagram

Biowulf cluster
The Biowulf cluster is a 95,000+ core/30+ PB Linux cluster. Biowulf is designed for large numbers of simultaneous jobs common in the biosciences, as well as large-scale distributed memory tasks such as molecular dynamics. A wide variety of scientific software is installed and maintained on Biowulf, along with scientific databases. See our hardware page for more details. Any scientific computation should be run on cluster compute nodes as batch jobs or sinteractive sessions. Compute nodes can access http and ftp sites outside our network via a proxy so that some data transfer jobs can be run on the cluster.

Login node
The login node (biowulf.nih.gov) is used to submit jobs to the cluster. Users connect to this system via ssh or NX. No compute intensive, data transfer or large file manipulation processes should be run on the login node. This system is for submitting jobs only.

Helix
Helix (helix.nih.gov) is the interactive data transfer and file management node for the NIH HPC Systems. Users should run all such processes (scp, sftp, Aspera transfers, rsync, wget/curl, large file compressions, etc.) on this system. Scientific applications are not available on Helix. Helix is a 48 core (4 X 3.00 GHz 12-core Xeon™ Gold 6136) system with 1.5 TB of main memory running RedHat Enterprise Linux 7 and has a direct connection to the internet.

Helixdrive
The helixdrive service allows users on the NIH network to mount their home, data, and shared directories as mapped network drives on their local workstations.

Sciware
Sciware is a ‘software on demand’ service that provides scientific software that runs on Windows, Mac and Linux desktops. Sciware is available to anyone with an HPC account. Software includes Matlab and Mathematica.

Helixweb
Helixweb is a set of web-based scientific tools.

Globus
Globus is a file transfer service that makes it easy to move, sync and share large amounts of data within the NIH as well as with other sites.

Proxy
The http and ftp proxies allow users to fetch data from the internet on compute nodes with tools like wget, curl, and ftp.

Subscribe to HPCwire's Weekly Update!

Be the most informed person in the room! Stay ahead of the tech trends with industy updates delivered to you every week!

Supercomputer Analysis Shows the Atmospheric Reach of the Tonga Eruption

January 21, 2022

On Saturday, an enormous eruption on the volcanic islands of Hunga Tonga and Hunga Haʻapai shook the Pacific Ocean. The explosion, which could be heard six thousand miles away in Alaska, caused tsunamis across the entir Read more…

NSB Issues US State of Science and Engineering 2022 Report

January 20, 2022

This week the National Science Board released its biannual U.S. State of Science and Engineering 2022 report, as required by the NSF Act. Broadly, the report presents a near-term view of S&E based mostly on 2019 data. To a large extent, this year’s edition echoes trends from the last few reports. The U.S. is still a world leader in R&D spending and S&E education... Read more…

Researchers Achieve 99 Percent Quantum Accuracy with Silicon-Embedded Qubits 

January 20, 2022

Researchers in Australia and the U.S. have made exciting headway in the quantum computing arms race. A multi-institutional team including the University of New South Wales and Sandia National Laboratory announced that th Read more…

Trio of Supercomputers Powers Estimate of Carbon in Earth’s Outer Core

January 20, 2022

Carbon is one of the essential building blocks of life on Earth, and it—along with hydrogen, nitrogen and oxygen—is one of the key elements researchers look for when they search for habitable planets and work to unde Read more…

Multiverse Targets ‘Quantum Computing for the Masses’

January 19, 2022

The race to deliver quantum computing solutions that shield users from the underlying complexity of quantum computing is heating up quickly. One example is Multiverse Computing, a European company, which today launched the second financial services product in its Singularity product group. The new offering, Fair Price, “delivers a higher accuracy in fair price calculations for financial... Read more…

AWS Solution Channel

shutterstock 718231072

Accelerating drug discovery with Amazon EC2 Spot Instances

This post was contributed by Cristian Măgherușan-Stanciu, Sr. Specialist Solution Architect, EC2 Spot, with contributions from Cristian Kniep, Sr. Developer Advocate for HPC and AWS Batch at AWS, Carlos Manzanedo Rueda, Principal Solutions Architect, EC2 Spot at AWS, Ludvig Nordstrom, Principal Solutions Architect at AWS, Vytautas Gapsys, project group leader at the Max Planck Institute for Biophysical Chemistry, and Carsten Kutzner, staff scientist at the Max Planck Institute for Biophysical Chemistry. Read more…

Students at SC21: Out in Front, Alongside and Behind the Scenes

January 19, 2022

The Supercomputing Conference (SC) is one of the biggest international conferences dedicated to high-performance computing, networking, storage and analysis. SC21 was a true ‘hybrid’ conference, with a total of 380 o Read more…

Supercomputer Analysis Shows the Atmospheric Reach of the Tonga Eruption

January 21, 2022

On Saturday, an enormous eruption on the volcanic islands of Hunga Tonga and Hunga Haʻapai shook the Pacific Ocean. The explosion, which could be heard six tho Read more…

NSB Issues US State of Science and Engineering 2022 Report

January 20, 2022

This week the National Science Board released its biannual U.S. State of Science and Engineering 2022 report, as required by the NSF Act. Broadly, the report presents a near-term view of S&E based mostly on 2019 data. To a large extent, this year’s edition echoes trends from the last few reports. The U.S. is still a world leader in R&D spending and S&E education... Read more…

Multiverse Targets ‘Quantum Computing for the Masses’

January 19, 2022

The race to deliver quantum computing solutions that shield users from the underlying complexity of quantum computing is heating up quickly. One example is Multiverse Computing, a European company, which today launched the second financial services product in its Singularity product group. The new offering, Fair Price, “delivers a higher accuracy in fair price calculations for financial... Read more…

Students at SC21: Out in Front, Alongside and Behind the Scenes

January 19, 2022

The Supercomputing Conference (SC) is one of the biggest international conferences dedicated to high-performance computing, networking, storage and analysis. SC Read more…

Q-Ctrl – Tackling Quantum Hardware’s Noise Problems with Software

January 13, 2022
