Finally! SC19 Competitors Live and in Color!

By Dan Olds

December 10, 2019

You know the saying “better late than never”? That’s how my cluster competition coverage is faring this year. With SC19 coming late in November, quickly followed by my annual trip to South Africa to cover their cluster competition, I’ve been running behind. But I’m back and I’m going to provide all of the deep analysis and competition coverage that you’ve all become accustomed to over the years.

Now let’s take an up close and personal look at our SC19 teams. Using the miracle of video, we’ve interviewed as many teams as we could given the accessibility constraints. We apologize to the teams that we couldn’t get to, but we were under the gun to get as many teams as we could during our limited access time. We managed to snare 12 out of 16, which isn’t too bad, I guess, but far from our usual 100% coverage, damn it.

Team Washington:  Representing the great Pacific Northwest, we have Team Washington or Team Husky or Team Udub. This team is driving a slim configuration with two nodes, but they’re also packing eight NVIDIA V100 GPUs, so they have plenty of processing power. This is a team that can adapt on the fly. For example: for some reason, teams have to have official data center racks for their cluster or else they’re disqualified. Back in the day, before we had all of these nitpicky rules, you used to be able to use about anything to hold your cluster. But today, you have to have an expensive rack to house your couple of nodes.

Anyway, the Udub students weren’t provided a rack by their sponsor and thus had to scramble to find one by Monday morning at 9:30 am or else face expulsion. They combed Craigslist and Facebook Marketplace and came up with a $100 42U rack. But it was in Boulder, not Denver. So they had to rent a truck, head to Boulder to pick it up, return the truck, and get it all set up by early Monday morning. Nice work, guys, great job.

Watch the video to see and hear more about the Washington team. Both my cluster competition color commentator Jessi Lanum and I were highly impressed by this first-time team. Let’s see how they do.

Team Warsaw:  Jessi and I interview Team Warsaw to see how this now veteran team are handling the pressure of the SC19 cluster competition. The students from Warsaw have one of their best configurations with five nodes, eight GPUs, and a beefy Mellanox EDR interconnect. The team this year is very solid and experienced, with great skills. Could this be the year that Team Warsaw breaks out of the pack?

It’s also a closely-knit team. When we were interviewing them, one of their team members was off sleeping, so they showed her picture to the camera just to make sure that she was included in the video.

Wake Forest:  When Jessi and I check in on them, Wake Forest seems to be happy with their performance so far in the competition. They’ve established a good division of labor and are using their machine well. We run into an anomaly in the team, a finance major! Well, a finance and computer science major, but it’s the first one we’ve run into in ten years of covering competitions.

On the reproducibility challenge, the Daemon Deacons found that the paper is valid. One of the students on this app is like the most chilled out competitor we’ve seen. Kicked back, easy going, relaxed, he’s the picture of happiness, which is nice to see. Check out the video to check him out.

One of the team’s network cards went out, which is unfortunate. Under the rules, the team can’t do a restart without taking a penalty, which, to me, is sort of unfair when it’s a hardware problem that is clearly outside of student control. But rules are rules, right?

University of Illinois Urbana-Champaign:  Team UIUC is doing well when we catch up with them, with some caveats. They’re driving an older cluster that seems like it’s become a bit crotchety in its old age. As the team captain said to us, if they’re not on top of it all the time, it tends to get out of hand and overheat. To me, this sounds a bit like a nuclear pile back in the old days.

The team has two NVMe drives on each of their four nodes, plus a grand total of eight NVIDIA V100 GPUs. They’re also using IBM’s Spectrum Scale (formerly GPFS) file system and tossed out some love to IBM by mentioning it.

Check out the video to get details on their various challenges and how they got over them.

UIUC had a $700 Azure Cloud budget that they managed to blow through pretty quickly. When we talked to them, they only had $6 left in their budget. Jessi and I offered to toss in $10 each to help them get a little breathing room, but that’s against the rules. Plus, I didn’t have the sawbuck on me anyway, so it all worked out well.

Team Tennessee:  This team is an amalgamation of students from University of Tennessee, Maryville College and Pellissippi State Community College. These are all first-time participants, so they have their work cut out for them. I give them a bit of grief over the unsuccessful Tennessee Volunteer football team, which was kind of fun.

While we’re interviewing the team, both Shanghai teams went over the power limit, causing sirens and lights to go off, which was also fun.

The team is realistic about their chances to take home the Championship Trophy (unfortunately, there is no real trophy). While they’re doing well, they know that it’s an uphill climb and that the most important thing about the competition is how much they’re learning. They hope to come back in subsequent years and mount another quest for cluster competition glory.

ETH Zurich:  This is the second outing for the Swiss team. Backed by the CSCS, this is a team that has proven they can compete with the top-tier competitors. How? In their first competition, they took home third place and the Highest LINPACK award at ISC19 – which is almost an unprecedented level of success for first timers. We hadn’t seen that kind of debut since the South African CHPC won the whole ISC shooting match in their first year back at ISC13.

The team is making good progress with the applications, with no apparent problems, when we find them on the competition floor. The stupid video is in and out of focus as the camera struggles to figure out where to focus.

During our conversation we discuss the differences between the ISC and SC competitions. More rules at SC, plus plenty of sleep deprivation, which is a marked difference from ISC. One of the team members said that the SC competition was “more competitive” than the ISC competition, begging the question (which I asked) “how can you say it’s more competitive when you didn’t actually win the ISC19 competition?” Mean question? Yeah, it was, but I hadn’t slept much either.

The team had a bit of a letdown on their LINPACK score, which was slightly lower than their championship LINPACK at ISC, but there’s a good explanation for the discrepancy. Check out the video for the details.

ShanghaiTech:  This is the third competition for a new university, ShanghaiTech. They were a powerful new competitor at ASC18, finishing in second place and punching their ticket for ISC18. They had a bit of a sophomore slump at ISC18, doing well, but not taking home any major prizes, although they were first in HPCG.

The first team member we interviewed talked about his past experience in FPGA design and claimed that his youth (he’s the youngest on the team) gives him an edge in productivity and creativity. The team has a solid complement of skills, ranging from traditional HPC drivers to computer architecture and AI specialists.

ShanghaiTech is pushing a large-ish cluster with six nodes and a whopping 16 NVIDIA V100 GPUs. That’s a whole hell of a lot of computing power, but it requires rigorous control and power throttling in order to keep it within the 3,000 watt limit. Can ShanghaiTech control this beast and get the most out of it? We’ll find out.
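For a rough sense of why throttling is unavoidable here, the back-of-the-envelope sketch below works through the arithmetic. The 3,000 watt limit, six nodes, and 16 GPUs come from the competition; the 300 watt nominal draw per V100 (the SXM2 rating; PCIe cards are rated at 250 watts) and the 200 watts of non-GPU overhead per node are assumptions for illustration only, not figures reported by the team.

```python
def per_gpu_power_cap(limit_watts, num_gpus, num_nodes, node_overhead_watts):
    """Return the per-GPU power cap (in watts) that keeps the whole
    cluster under limit_watts, after reserving a fixed non-GPU overhead
    (CPUs, memory, fans, network) for every node."""
    gpu_budget = limit_watts - num_nodes * node_overhead_watts
    if gpu_budget <= 0:
        raise ValueError("node overhead alone exceeds the power limit")
    return gpu_budget / num_gpus


LIMIT_WATTS = 3000          # competition power limit (from the article)
NUM_NODES = 6               # ShanghaiTech node count (from the article)
NUM_GPUS = 16               # ShanghaiTech GPU count (from the article)
V100_NOMINAL_WATTS = 300    # ASSUMPTION: nominal V100 SXM2 board power
NODE_OVERHEAD_WATTS = 200   # ASSUMPTION: hypothetical non-GPU draw per node

# Draw of the GPUs alone if every card ran at its nominal rating.
uncapped_gpu_watts = NUM_GPUS * V100_NOMINAL_WATTS

# Cap each GPU would need so that nodes plus GPUs fit under the limit.
cap = per_gpu_power_cap(LIMIT_WATTS, NUM_GPUS, NUM_NODES, NODE_OVERHEAD_WATTS)

print(f"GPUs alone at nominal power: {uncapped_gpu_watts} W")
print(f"Per-GPU cap under these assumptions: {cap:.1f} W")
```

Under these assumed numbers, the GPUs by themselves would pull well over the limit at full tilt, so each card has to run at a fraction of its rated power. The real figures depend on the team's actual hardware and workload.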

Purdue:  As an institution, Purdue has sponsored 14 cluster teams in worldwide major competitions. While they haven’t come home with any trophies, they’ve gained a lot of knowledge and have even built a curriculum around the events – which is a very good thing.

They’re running a system with very sporty AMD 32-core Rome processors arranged in five single-node systems. Unfortunately, their motherboards don’t support GPUs, which is a huge disadvantage in modern cluster competitions. It was unclear whether or not this configuration was intentional, thinking it could win, or if it was a technical oversight. But either way, they’re trying their best and giving it the old Purdue try – which is what you do when you’re in a cluster competition.

Team NTHU:  This is another team that has been around the block in Student Cluster Competitions, logging an astounding 17 major events over the last 12 years. They’ve amassed an enviable record of Gold Medals and LINPACK Awards, with their most recent win coming at ASC19 in Dalian, China.

They’re in a bit of trouble when we catch up to them. They have a GPU down and they can’t fix it due to cabling problems. They do have seven other GPUs, but that might not be enough to get them over the hump.

However, like most all NTHU teams, they’ve done a great job in optimizing the apps and getting them to run. NTHU almost never submits a zero score, no matter what. In the video, I tell a story about how NTHU outwitted all the other teams during their New Orleans 2010 win – a story that is now referred to as the “Super Sort.” It’s good watching.

Nanyang Technological University:  Team Nanyang, the pride of Singapore, has become a top echelon team over the past few years and is always a threat to walk away with multiple trophies. They’re a pioneer in the “small is beautiful” cluster movement and are at it again with a two node, 16 GPU system. As we heard in the interview, the team has notched another LINPACK award. We’ll have details on that in our next story.

As we meet with the team, they’re coming down to the wire on turning in applications – but, as we note in the video, Nanyang has never not turned in a result, which is an incredible feat in modern competition history.

Team FAU:  This German team has had a long and storied history. They’ve won two LINPACK awards along with a Bronze Medal in their seven-year history. This year, they’re driving a NEC Aurora vector machine, which is a whole different deal for the team, who are used to driving more conventional clusters.

One of the problems they have is that their vector engines broke down during the benchmarking phase of the competition. They had to pull them from the cluster, which means they can only run on CPU power. This won’t give them enough processing power to compete with the other teams, unfortunately. But the plucky Germans are continuing to push and will certainly finish the competition and never give up. There just isn’t any quit in this team.

Shanghai Jiao Tong:  This is one of my favorite teams. Their coach was a long-time competitor for the school, and I must have interviewed him ten times over the years at multiple venues. He’s a hard charger, highly competitive, but more interested in what his team can take from the competition knowledge wise than taking home trophies.

Jessi and I catch up with Shanghai Jiao Tong and ask them about their competition so far. While Shanghai has had some hardware problems in the past, everything is running at 100% today. The team is driving one of the larger clusters in the competition with six nodes, eight V100 GPUs and some of the fastest CPUs in the competition at 2.6 GHz. To me, this team has been poised on the edge of moving into the top tier of cluster competition teams but hasn’t quite gotten into the groove yet. This could be their year.

Next up, we’re going to take an in-depth look at the LINPACK and HPCG results, then reveal the detailed overall scoring. Following that, we’ll provide our patent-pending “Power Ranking Analysis” which shows which teams are getting the most performance out of their systems. Stay tuned to this channel for all the latest. If you want to catch up on your Student Cluster Competition history, check out the new Student Cluster Competition website.

 
