Finally! SC19 Competitors Live and in Color!

By Dan Olds

December 10, 2019

You know the saying “better late than never”? That’s how my cluster competition coverage is faring this year. With SC19 coming late in November, quickly followed by my annual trip to South Africa to cover their cluster competition, I’ve been running behind. But I’m back and I’m going to provide all of the deep analysis and competition coverage that you’ve all become accustomed to over the years.

Now let’s take an up close and personal look at our SC19 teams. Using the miracle of video, we’ve interviewed as many teams as we could given the accessibility constraints. We apologize to the teams that we couldn’t get to, but we were under the gun to get as many teams as we could during our limited access time. We managed to snare 12 out of 16, which isn’t too bad, I guess, but far from our usual 100% coverage, damn it.

Team Washington:  Representing the great Pacific Northwest, we have Team Washington or Team Husky or Team Udub. This team is driving a slim configuration with two nodes, but they’re also packing eight NVIDIA V100 GPUs, so they have plenty of processing power. This is also a team that can adapt on the fly. Case in point: for some reason, teams have to have official data center racks for their cluster or else they’re disqualified. Back in the day, before we had all of these nitpicky rules, you used to be able to use about anything to hold your cluster. But today, you have to have an expensive rack to house your couple of nodes.

Anyway, the Udub students weren’t provided a rack by their sponsor and thus had to scramble to find one by Monday morning at 9:30 am or else face expulsion. They combed Craigslist and Facebook Marketplace and came up with a $100 42U rack. But it was in Boulder, not Denver. So they had to rent a truck, head to Boulder to pick it up, return the truck, and get it all set up by early Monday morning. Nice work, guys, great job.

https://youtu.be/kmr_PeH2RZo

Watch the video to see and hear more about the Washington team. Both my cluster competition color commentator Jessi Lanum and I were highly impressed by this first-time team. Let’s see how they do.

Team Warsaw:  Jessi and I interview Team Warsaw to see how this now-veteran team is handling the pressure of the SC19 cluster competition. The students from Warsaw have one of their best configurations with five nodes, eight GPUs, and a beefy Mellanox EDR interconnect. The team this year is very solid and experienced, with great skills. Could this be the year that Team Warsaw breaks out of the pack?

It’s also a closely-knit team. When we were interviewing them, one of their team members was off sleeping, so they showed her picture to the camera just to make sure that she was included in the video.

Wake Forest:  When Jessi and I check in on them, Wake Forest seems to be happy with their performance so far in the competition. They’ve established a good division of labor and are using their machine well. We run into an anomaly in the team, a finance major! Well, a finance and computer science major, but it’s the first one we’ve run into in ten years of covering competitions.

On the reproducibility challenge, the Daemon Deacons found that the paper is valid. One of the students on this app is like the most chilled out competitor we’ve seen. Kicked back, easy going, relaxed, he’s the picture of happiness, which is nice to see. Check out the video to check him out.

One of the team’s network cards went out, which is unfortunate. Under the rules, the team can’t do a restart without taking a penalty, which, to me, is sort of unfair when it’s a hardware problem that is clearly outside of student control. But rules are rules, right?

University of Illinois Urbana-Champaign:  Team UIUC is doing well when we catch up with them, with some caveats. They’re driving an older cluster that seems like it’s become a bit crotchety in its old age. As the team captain said to us, if they’re not on top of it all the time, it tends to get out of hand and overheat. To me, this sounds a bit like a nuclear pile back in the old days.

The team has two NVMe drives on each of their four nodes, plus a grand total of eight NVIDIA V100 GPUs. They’re also using IBM’s Spectrum Scale (formerly GPFS) file system and tossed out some love to IBM by mentioning it.

Check out the video to get details on their various challenges and how they got over them.

UIUC had a $700 Azure Cloud budget that they managed to blow through pretty quickly. When we talked to them, they only had $6 left in their budget. Jessi and I offered to toss in $10 each to help them get a little breathing room, but that’s against the rules. Plus, I didn’t have the sawbuck on me anyway, so it all worked out well.

Team Tennessee:  This team is an amalgamation of students from the University of Tennessee, Maryville College and Pellissippi State Community College. These are all first-time participants, so they have their work cut out for them. I give them a bit of grief over the unsuccessful Tennessee Volunteer football team, which was kind of fun.

While we were interviewing the team, both Shanghai teams went over the power limit, causing sirens and lights to go off, which was also fun.

The team is realistic about their chances to take home the Championship Trophy (unfortunately, there is no real trophy). While they’re doing well, they know that it’s an uphill climb and that the most important thing about the competition is how much they’re learning. They hope to come back in subsequent years and mount another quest for cluster competition glory.

ETH Zurich:  This is the second outing for the Swiss team. Backed by the CSCS, this is a team that has proven they can compete with the top-tier competitors. How? In their first competition, they took home third place and the Highest LINPACK award at ISC19 – which is an almost unprecedented level of success for first timers. We hadn’t seen that kind of debut since the South African CHPC won the whole ISC shooting match in their first year back at ISC13.

The team is making good progress with the applications, with no apparent problems, when we find them on the competition floor. The stupid video is in and out of focus as the camera struggles to figure out where to focus.

During our conversation we discuss the differences between the ISC and SC competition. More rules at SC, plus plenty of sleep deprivation, which is a marked difference from ISC. One of the team members said that the SC competition was “more competitive” than the ISC competition, begging the question (which I asked) “how can you say it’s more competitive when you didn’t actually win the ISC19 competition?” Mean question? Yeah, it was, but I hadn’t slept much either.

The team had a bit of a letdown on their LINPACK score, which was slightly lower than their championship LINPACK at ISC, but there’s a good explanation for the discrepancy. Check out the video for the details.

ShanghaiTech:  This is the third competition for a new university, ShanghaiTech. They were a powerful new competitor at ASC18, finishing in second place and punching their ticket for ISC18. They had a bit of a sophomore slump at ISC18, doing well, but not taking home any major prizes, although they were first in HPCG.

The first team member we interviewed talked about his past experience in FPGA design and claimed that his youth (he’s the youngest on the team) gives him an edge in productivity and creativity. The team has a solid complement of skills, ranging from traditional HPC drivers to computer architecture and AI specialists.

ShanghaiTech is pushing a large-ish cluster with six nodes and a whopping 16 NVIDIA V100 GPUs. That’s a whole hell of a lot of computing power, but it requires rigorous control and power throttling in order to keep it within the 3,000 watt limit. Can ShanghaiTech control this beast and get the most out of it? We’ll find out.

Purdue:  As an institution, Purdue has sponsored 14 cluster teams in worldwide major competitions. While they haven’t come home with any trophies, they’ve gained a lot of knowledge and have even built a curriculum around the events – which is a very good thing.

They’re running a system with very sporty AMD 32-core Rome processors arranged in five single-node systems. Unfortunately, their motherboards don’t support GPUs, which is a huge disadvantage in modern cluster competitions. It was unclear whether this configuration was an intentional choice that the team thought could win or a technical oversight. But either way, they’re trying their best and giving it the old Purdue try – which is what you do when you’re in a cluster competition.

Team NTHU:  This is another team that has been around the block in Student Cluster Competitions, logging an astounding 17 major events over the last 12 years. They’ve amassed an enviable record of Gold Medals and LINPACK Awards, with their most recent win coming at ASC19 in Dalian, China.

They’re in a bit of trouble when we catch up to them. They have a GPU down and they can’t fix it due to cabling problems. They do have seven other GPUs, but that might not be enough to get them over the hump.

However, like most all NTHU teams, they’ve done a great job in optimizing the apps and getting them to run. NTHU almost never submits a zero score, no matter what. In the video, I tell a story about how NTHU outwitted all the other teams during their New Orleans 2010 win – a story that is now referred to as the “Super Sort.” It’s good watching.

Nanyang Technological University:  Team Nanyang, the pride of Singapore, has become a top echelon team over the past few years and is always a threat to walk away with multiple trophies. They’re a pioneer in the “small is beautiful” cluster movement and are at it again with a two node, 16 GPU system. As we heard in the interview, the team has notched another LINPACK award. We’ll have details on that in our next story.

As we meet with the team, they’re coming down to the wire on turning in applications – but, as we note in the video, Nanyang has never not turned in a result, which is an incredible feat in modern competition history.

Team FAU:  This German team has had a long and storied history. They’ve won two LINPACK awards along with a Bronze Medal in their seven-year history. This year, they’re driving an NEC Aurora vector machine, which is a whole different deal for the team, who are used to driving more conventional clusters.

One of the problems they have is that their vector engines broke down during the benchmarking phase of the competition. They had to pull them from the cluster, which means they can only run on CPU power. This won’t give them enough processing power to compete with the other teams, unfortunately. But the plucky Germans are continuing to push and will certainly finish the competition and never give up. There just isn’t any quit in this team.

Shanghai Jiao Tong:  This is one of my favorite teams. Their coach was a long-time competitor for the school, and I must have interviewed him ten times over the years at multiple venues. He’s a hard charger, highly competitive, but more interested in what his team can take from the competition knowledge wise than taking home trophies.

Jessi and I catch up with Shanghai Jiao Tong and ask them about their competition so far. While Shanghai has had some hardware problems in the past, everything is running at 100% today. The team is driving one of the larger clusters in the competition with six nodes, eight V100 GPUs and some of the fastest CPUs in the competition at 2.6 GHz. To me, this team has been poised on the edge of moving into the top tier of cluster competition teams but hasn’t quite gotten into the groove yet. This could be their year.

Next up, we’re going to take an in-depth look at the LINPACK and HPCG results, then reveal the detailed overall scoring. Following that, we’ll provide our patent-pending “Power Ranking Analysis,” which shows which teams are getting the most performance out of their systems. Stay tuned to this channel for all the latest. If you want to catch up on your Student Cluster Competition history, check out the new Student Cluster Competition website.

 
