Finally! SC19 Competitors Live and in Color!

By Dan Olds

December 10, 2019

You know the saying “better late than never”? That’s how my cluster competition coverage is faring this year. With SC19 coming late in November, quickly followed by my annual trip to South Africa to cover their cluster competition, I’ve been running behind. But I’m back and I’m going to provide all of the deep analysis and competition coverage that you’ve all become accustomed to over the years.

Now let’s take an up close and personal look at our SC19 teams. Using the miracle of video, we’ve interviewed as many teams as we could given the accessibility constraints. We apologize to the teams that we couldn’t get to, but we were under the gun to get as many teams as we could during our limited access time. We managed to snare 12 out of 16, which isn’t too bad, I guess, but far from our usual 100% coverage, damn it.

Team Washington:  Representing the great Pacific Northwest, we have Team Washington or Team Husky or Team Udub. This team is driving a slim configuration with two nodes, but they’re also packing eight NVIDIA V100 GPUs, so they have plenty of processing power. This is a team that can adapt on the fly, for example:  For some reason, teams have to have official data center racks for their cluster or else they’re disqualified. Back in the day, before we had all of these nitpicky rules, you used to be able to use about anything to hold your cluster. But today, you have to have an expensive rack to house your couple of nodes.

Anyway, the Udub students weren’t provided a rack by their sponsor and thus had to scramble to find one by Monday morning at 9:30 am or else face expulsion. They combed Craigslist and Facebook Marketplace and came up with a $100 42U rack. But it was in Boulder, not Denver. So they had to rent a truck, head to Boulder to pick it up, return the truck, and get it all set up by early Monday morning. Nice work, guys, great job.

Watch the video to see and hear more about the Washington team. Both my cluster competition color commentator Jessi Lanum and I were highly impressed by this first-time team. Let’s see how they do.

Team Warsaw:  Jessi and I interview Team Warsaw to see how this now-veteran team is handling the pressure of the SC19 cluster competition. The students from Warsaw have one of their best configurations with five nodes, eight GPUs, and a beefy Mellanox EDR interconnect. The team this year is very solid and experienced, with great skills. Could this be the year that Team Warsaw breaks out of the pack?

It’s also a closely-knit team. When we were interviewing them, one of their team members was off sleeping, so they showed her picture to the camera just to make sure that she was included in the video.

Wake Forest:  When Jessi and I check in on them, Wake Forest seems to be happy with their performance so far in the competition. They’ve established a good division of labor and are using their machine well. We run into an anomaly in the team, a finance major! Well, a finance and computer science major, but it’s the first one we’ve run into in ten years of covering competitions.

On the reproducibility challenge, the Daemon Deacons found that the paper is valid. One of the students on this app is like the most chilled out competitor we’ve seen. Kicked back, easy going, relaxed, he’s the picture of happiness, which is nice to see. Check out the video to check him out.

One of the team’s network cards went out, which is unfortunate. Under the rules, the team can’t do a restart without taking a penalty, which, to me, is sort of unfair when it’s a hardware problem that is clearly outside of student control. But rules are rules, right?

University of Illinois Urbana-Champaign:  Team UIUC is doing well when we catch up with them, with some caveats. They’re driving an older cluster that seems like it’s become a bit crotchety in its old age. As the team captain said to us, if they’re not on top of it all the time, it tends to get out of hand and overheat. To me, this sounds a bit like a nuclear pile back in the old days.

The team has two NVMe drives on each of their four nodes, plus a grand total of eight NVIDIA V100 GPUs. They’re also using IBM’s Spectrum Scale (formerly GPFS) file system and tossed out some love to IBM by mentioning it.

Check out the video to get details on their various challenges and how they got over them.

UIUC had a $700 Azure Cloud budget that they managed to blow through pretty quickly. When we talked to them, they only had $6 left in their budget. Jessi and I offered to toss in $10 each to help them get a little breathing room, but that’s against the rules. Plus, I didn’t have the sawbuck on me anyway, so it all worked out well.

Team Tennessee:  This team is an amalgamation of students from University of Tennessee, Maryville College and Pellissippi State Community College. These are all first-time participants, so they have their work cut out for them. I give them a bit of grief over the unsuccessful Tennessee Volunteer football team, which was kind of fun.

While we were interviewing the team, both Shanghai teams went over the power limit, causing sirens and lights to go off, which was also fun.

The team is realistic about their chances to take home the Championship Trophy (unfortunately, there is no real trophy). While they’re doing well, they know that it’s an uphill climb and that the most important thing about the competition is how much they’re learning. They hope to come back in subsequent years and mount another quest for cluster competition glory.

ETH Zurich:  This is the second outing for the Swiss team. Backed by the CSCS, this is a team that has proven they can compete with the top-tier competitors. How? In their first competition, they took home third place and the Highest LINPACK award at ISC19 – which is an almost unprecedented level of success for first timers. We hadn’t seen that kind of debut since the South African CHPC won the whole ISC shooting match in their first year back at ISC13.

The team is making good progress with the applications, with no apparent problems, when we find them on the competition floor. The stupid video is in and out of focus as the camera struggles to figure out where to focus.

During our conversation we discuss the differences between the ISC and SC competitions. More rules at SC, plus plenty of sleep deprivation, which is a marked difference from ISC. One of the team members said that the SC competition was “more competitive” than the ISC competition, begging the question (which I asked) “how can you say it’s more competitive when you didn’t actually win the ISC19 competition?” Mean question? Yeah, it was, but I hadn’t slept much either.

The team had a bit of a letdown on their LINPACK score, which was slightly lower than their championship LINPACK at ISC, but there’s a good explanation for the discrepancy. Check out the video for the details.

ShanghaiTech:  This is the third competition for a new university, ShanghaiTech. They were a powerful new competitor at ASC18, finishing in second place and punching their ticket for ISC18. They had a bit of a sophomore slump at ISC18, doing well, but not taking home any major prizes, although they were first in HPCG.

The first team member we interviewed talked about his past experience in FPGA design and claimed that his youth (he’s the youngest on the team) gives him an edge in productivity and creativity. The team has a solid complement of skills, ranging from traditional HPC drivers to computer architecture and AI specialists.

ShanghaiTech is pushing a large-ish cluster with six nodes and a whopping 16 NVIDIA V100 GPUs. That’s a whole hell of a lot of computing power, but it requires rigorous control and power throttling in order to keep it within the 3,000 watt limit. Can ShanghaiTech control this beast and get the most out of it? We’ll find out.
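To get a feel for why that throttling matters, here is a rough back-of-the-envelope sketch. Only the 3,000 watt limit and the six-node, 16-GPU counts come from the team's configuration. The 250 watt per-GPU figure assumes PCIe V100 cards at their rated board power, and the 200 watts per node for everything else is a number made up purely for illustration, as is the helper name `per_gpu_cap`. The team's real figures may look quite different.

```python
# Hypothetical power-budget arithmetic. Only the 3,000 W limit and the
# six-node, 16-GPU counts come from the article; the rest are assumptions.
POWER_LIMIT_W = 3000
NUM_GPUS = 16
NUM_NODES = 6

ASSUMED_GPU_MAX_W = 250    # assumption: rated board power of a PCIe V100
ASSUMED_NODE_BASE_W = 200  # assumption: invented per-node draw for CPUs, memory, fans


def per_gpu_cap(limit_w, num_gpus, num_nodes, node_base_w):
    """Watts each GPU may draw if every node also draws node_base_w
    and the cluster total must stay at or below limit_w."""
    remaining_w = limit_w - num_nodes * node_base_w
    if remaining_w <= 0:
        raise ValueError("non-GPU draw alone exceeds the power limit")
    return remaining_w / num_gpus


# Total draw if every GPU ran flat out with no throttling at all.
uncapped_total_w = NUM_GPUS * ASSUMED_GPU_MAX_W + NUM_NODES * ASSUMED_NODE_BASE_W
# Even split of whatever is left after the assumed non-GPU draw.
cap_w = per_gpu_cap(POWER_LIMIT_W, NUM_GPUS, NUM_NODES, ASSUMED_NODE_BASE_W)

print(f"Uncapped draw: {uncapped_total_w} W against a {POWER_LIMIT_W} W limit")
print(f"Per-GPU cap needed: {cap_w:.1f} W")
```

Under those assumed numbers, the uncapped cluster would pull 5,200 watts and each GPU would have to be held to about 112.5 watts, which is less than half of what the card is rated for. Splitting the budget evenly is the simplest possible policy; a real team would more likely shift power between CPUs and GPUs depending on the application.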

Purdue:  As an institution, Purdue has sponsored 14 cluster teams in worldwide major competitions. While they haven’t come home with any trophies, they’ve gained a lot of knowledge and have even built a curriculum around the events – which is a very good thing.

They’re running a system with very sporty AMD 32-core Rome processors arranged in five single-node systems. Unfortunately, their motherboards don’t support GPUs, which is a huge disadvantage in modern cluster competitions. It was unclear whether or not this configuration was intentional, thinking it could win, or if it was a technical oversight. But either way, they’re trying their best and giving it the old Purdue try – which is what you do when you’re in a cluster competition.

Team NTHU:  This is another team that has been around the block in Student Cluster Competitions, logging an astounding 17 major events over the last 12 years. They’ve amassed an enviable record of Gold Medals and LINPACK Awards, with their most recent win coming at ASC19 in Dalian, China.

They’re in a bit of trouble when we catch up to them. They have a GPU down and they can’t fix it due to cabling problems. They do have seven other GPUs, but that might not be enough to get them over the hump.

However, like most all NTHU teams, they’ve done a great job in optimizing the apps and getting them to run. NTHU almost never submits a zero score, no matter what. In the video, I tell a story about how NTHU outwitted all the other teams during their New Orleans 2010 win – a story that is now referred to as the “Super Sort.” It’s good watching.

Nanyang Technological University:  Team Nanyang, the pride of Singapore, has become a top echelon team over the past few years and is always a threat to walk away with multiple trophies. They’re a pioneer in the “small is beautiful” cluster movement and are at it again with a two-node, 16-GPU system. As we heard in the interview, the team has notched another LINPACK award. We’ll have details on that in our next story.

As we meet with the team, they’re coming down to the wire on turning in applications – but, as we note in the video, Nanyang has never not turned in a result, which is an incredible feat in modern competition history.

Team FAU:  This German team has had a long and storied history. They’ve won two LINPACK awards along with a Bronze Medal in their seven-year history. This year, they’re driving an NEC Aurora vector machine, which is a whole different deal for the team, who are used to driving more conventional clusters.

One of the problems they have is that their vector engines broke down during the benchmarking phase of the competition. They had to pull them from the cluster, which means they can only run on CPU power. This won’t give them enough processing power to compete with the other teams, unfortunately. But the plucky Germans are continuing to push and will certainly finish the competition and never give up. There just isn’t any quit in this team.

Shanghai Jiao Tong:  This is one of my favorite teams. Their coach was a long-time competitor for the school, and I must have interviewed him ten times over the years at multiple venues. He’s a hard charger, highly competitive, but more interested in what his team can take from the competition knowledge-wise than taking home trophies.

Jessi and I catch up with Shanghai Jiao Tong and ask them about their competition so far. While Shanghai has had some hardware problems in the past, everything is running at 100% today. The team is driving one of the larger clusters in the competition with six nodes, eight V100 GPUs and some of the fastest CPUs in the competition at 2.6 GHz. To me, this team has been poised on the edge of moving into the top tier of cluster competition teams but hasn’t quite gotten into the groove yet. This could be their year.

Next up, we’re going to take an in-depth look at the LINPACK and HPCG results, then reveal the detailed overall scoring. Following that, we’ll provide our patent-pending “Power Ranking Analysis” which shows which teams are getting the most performance out of their systems. Stay tuned to this channel for all the latest. If you want to catch up on your Student Cluster Competition history, check out the new Student Cluster Competition website.

 
