Less than 10 months after it was announced, the Columbus-based Ohio Supercomputer Center (OSC) has debuted its Dell-built GPU cluster, “Ascend.” Designed to help meet the ever-growing need for GPUs in the OSC user community, Ascend is a small system equipped with around a hundred Nvidia GPUs. It weighs in at around 2 peak petaflops and complements OSC’s more traditional HPC systems.
Ascend is a Dell PowerEdge system with 24 nodes. Each node has two AMD Epyc “Milan” 7643 CPUs (for a total of 48); four Nvidia A100 (80GB) GPUs (for a total of 96); and 921GB of “usable” memory (for a total of ~22TB). The system uses Nvidia InfiniBand (200Gb/s) networking and is estimated at 1.95 peak petaflops; OSC hasn’t submitted Linpack benchmarks since the launch of its Owens system in 2016. Speaking of Owens, both it and the more recent Pitzer system (2018) are still in play at OSC: both are Dell-built and powered by Intel and Nvidia hardware, each consists of hundreds of nodes, and OSC says that together they total around 5.5 peak petaflops.
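As a quick cross-check, the system-wide totals follow directly from the per-node figures quoted above. The short Python sketch below is only illustrative arithmetic: the variable names are made up for this example, the conversion to terabytes assumes decimal units (1TB = 1,000GB), and the aggregate GPU memory figure is derived here rather than stated by OSC.

```python
# Per-node figures as quoted in the article.
NODES = 24
CPUS_PER_NODE = 2             # dual AMD Epyc "Milan" 7643
GPUS_PER_NODE = 4             # Nvidia A100
GPU_MEM_GB = 80               # memory per A100
USABLE_MEM_GB_PER_NODE = 921  # "usable" system memory per node

# System-wide totals.
total_cpus = NODES * CPUS_PER_NODE
total_gpus = NODES * GPUS_PER_NODE
total_mem_gb = NODES * USABLE_MEM_GB_PER_NODE
total_mem_tb = total_mem_gb / 1000  # assumption: decimal TB (1TB = 1,000GB)

# Derived figure, not quoted by OSC: aggregate GPU memory.
total_gpu_mem_gb = total_gpus * GPU_MEM_GB

print(f"CPUs: {total_cpus}")
print(f"GPUs: {total_gpus}")
print(f"Usable memory: {total_mem_gb}GB (~{total_mem_tb:.0f}TB)")
print(f"Aggregate GPU memory: {total_gpu_mem_gb}GB")
```

The results line up with the totals in the text: 48 CPUs, 96 GPUs and roughly 22TB of usable memory.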
OSC is presenting Ascend as a complementary, immediate solution for its user community. While Ascend represents only a ~36% increase in peak petaflops, OSC says that the new cluster (which it stops short of calling a supercomputer) triples the center’s capacity for AI, modeling and simulation.
“OSC developed Ascend in response to discussions with our client community, stakeholders and vendors, who identified an immediate need for greater GPU resources to process research and simulations that rely on AI, big data and machine learning,” said David Hudak, OSC’s executive director. “We are pleased to be able to offer this major new resource to the HPC community and support client advancements in academic research and commercial technologies.”
One early access user, OSU assistant professor Yu Su, reported positive results, running one of the largest neural network models – BLOOM-176B – on OSU hardware for the first time and highlighting student experiences of 2× to 3× faster processing compared to the older hardware. One graduate student, Bargeen Turzo, reported drastic differences in protein prediction: “For some large proteins I was not even able to get a single prediction on Pitzer after running the calculation for multiple weeks,” Turzo said. “While on Ascend the same calculation finished in 12 hours.”
Ascend was installed last fall, and OSC tested the system from October through December alongside Su and other early access users. “Part of the goal of the early-user period was to get a better understanding of how the user applications make use of the GPUs that we are supporting in the system,” explained Doug Johnson, associate director of OSC. “We will continue to improve the software and management of the system as we learn more from what we encounter supporting the early users and operating the system for a longer period of time.”
“OSC’s client services and scientific applications teams will be available to help our clients determine if their applications can make good use of the Ascend GPUs,” Johnson added. “For some applications there is a large performance benefit for using the GPUs and Ascend will make it possible for our clients to tackle some problems that can’t be solved on our current systems.”
OSC also shared a peek at its future plans, noting that it intends to replace its Owens cluster (at seven years old, an old-timer by HPC standards) this year. Owens and its successor will run concurrently for a period before Owens is phased out.
Last month, OSC celebrated its 35-year anniversary. “It’s our job to constantly be on the cutting edge of technology, evaluating and deploying it, and making it available here in Ohio,” Hudak said on that occasion. “Beyond simply providing the technology, though, we also make it uniquely flexible, affordable and easy to access thanks to 35 years of experience pushing Ohio’s capabilities forward.”