Making the Team South Africa: Defending the Crown

By Dan Olds

June 15, 2020

As you read this article, 82 university students from 11 countries are working feverishly on a cluster located at the National Supercomputing Centre of Singapore to try to win the ISC 2020 Student Cluster Competition golden crown. Ok, there isn’t an actual golden crown, but there are trophies, including a big one for the Overall Champion.

One of these teams is from the Centre for High Performance Computing located in South Africa. This is their seventh appearance in the ISC cluster wars and they’ve built up an incredible record of four gold medals, two silver medals and a bronze. In other words, they have made the podium every single time they’ve competed.

This achievement is all the more impressive because each of their teams is a unique set of undergrads – no repeats allowed. Some teams have the same students appearing in every competition until they lose their eligibility and go pro. Not the case with South Africa; it’s one and done for them. Former team members mentor new members but can’t compete more than once in the big dance.

Little Dance Then Big Dance

The CHPC is the only organization that has a ‘play in’ round to select their ISC team. Early in the competition year, the word goes out to universities all over South Africa:  Put together your cluster teams. It’s go time.

The organization provides training materials and classes to help prepare the HPC beginners to compete at the CHPC HPC forum that occurs every December. At the forum, ten student cluster teams from various universities gather to duke it out to see who will be selected for the national team.

I had the privilege of attending the 2019 CHPC cluster competition and covering the three student competitions that took place:  the cluster competition, the cyber-security competition, and the AI competition. In this article, I’m going to take you through the cluster competition in detail.

Ten Teams – One Winner

Each team is composed of four undergraduate students. They are assisted by mentors from past CHPC cluster competition teams, which is very cool. The overall winning team will form the foundation of the national team, joined by two outstanding competitors from the non-winning teams plus two alternates.

Through the miracle of video and extra airline luggage fees to haul the equipment to Johannesburg, South Africa, I was able to interview each of the teams twice, once to meet them, then again towards the end of the competition as a check in. Let’s take a look…

Team Alt F4:  Named after the shut down command, this team is looking to shut down the other competitors. When we first check in on the team, they’re doing well, but are already tired when we reach them on the second day. This is one of those teams where everyone does everything without a lot of specialization.

When we check back in on the team, it’s a bit of a different story. When I ask how they’re doing, the mood is definitely different – they’re in crunch time. They’ve been having problems compiling some of the applications, which is typical for these competitions.

Team It’s Spelt Bolognese:  This team has one of the more unusual names in the competition, a real head scratcher for me. So that is, of course, my first question for them. Explanation? Watch the video to see.

The team is driving a three-node cluster with a switch that is supposedly on the way but hasn’t arrived yet. (As it turns out, none of the teams get their switches in time, so they all go with point to point interconnects – old school, love it.)  The whole team is from Cape Town, so Johannesburg is, according to them, a real treat. When we check in with the team on the last day, they’re struggling to get some results to submit. Like some of the other teams, it’s the compilers that are the issue – trying to find the right compiler for each app. This is, as we’ve seen, a common story and one that we’ll hear again.

Team Ketamine:  Ketamine is a horse tranquilizer which kind of goes with the motif of their booth. It’s a tranquil place with mood lighting and a laid-back style. When we catch up to the team early on, their three-node cluster is working well and the team is working on getting their benchmarks compiled.

According to the team, it’s “vibe first, Germany second” meaning that their mood is more important than winning and getting the coveted trip to Frankfurt for the ISC finals. They have a ‘different concept’ about what winning should mean in this context. To them, having a great time with their friends while at the CHPC conference is the ultimate win. We get into a bit of a dispute about how well this attitude will serve them in the big picture. I can’t tell if they’re just yanking me or being serious, although the team says they are serious. Check out the video and see what I mean.

Team Send Nodes:  Send Nodes is learning the fine art of building switchless interconnects as we catch up with them on the first day. They’re soldiering through and getting the hang of it. The team is running what seems to be the standard three-node configuration with each node being a compute node – no need to have a dedicated head node in clusters this small, right?

The team has appointed a “Compiler Tsar” who is responsible for finding and selecting just the right compiler for the job – sort of like an HPC sous chef. When we interview the team on the last day, we find them busily putting the finishing touches on their applications and trying to get the best results possible. They’re still getting plenty of error messages, some of them unique to their team, which is a bit troubling. While they’ve gotten to the point where they get to use the NVIDIA V100 GPU nodes in the cloud, they’re having trouble getting Quantum ESPRESSO to compile so that they can run it on the cloudy infrastructure.

Team Vision 404:  Another interesting name. Combining “file not found” with “vision” could be interpreted as a bad thing. The team sees it as hopeful, although I’m not sure why. Team 404 hasn’t really divided up their work to a great degree, but on further questioning, it seems like one guy is responsible for most of the applications/benchmarks. The team also has a ‘Designated Google Guy’, a surfer dude who does all of the team research and provides answers back to the other students. Good division of labor.

On the last day of the competition, Vision 404 is fired up. They’re tired, sure, but they know this is the time to drive hard. As we comment, “don’t hate the player, hate the game”; at this point they’re resigned to competing against themselves and for posterity. Great attitude, love their passion and drive to learn.

Team SomberSystem:  Kind of a sad name that was picked out of the blue by the team. They’re not all that somber, which is a good thing. Their system is three huge workstations connected by a point to point interconnect through their head node. On the first day, they’re having some problems getting their cluster to scale. It sounds like an MPI problem; they can run on a single node, but can’t get the app to scale and use memory on other nodes. I have some inane potential solutions for them, which are discarded instantly.

They have a team morale officer who tells jokes to keep the team loose and having fun. This is always a good thing as student clustering is tense business.

Team Nova Tech:  Imagine my shock when I approached the team and found that they only had two members instead of four. This cluster competition puts a huge workload on a four-person team; it’s doubly huge for two (that’s just simple math, right?). This is the only team that has more nodes, at three, than team members. We’ll see how they hold up as the competition goes forward.

On our last day update, Team Nova Tech is still fighting. These guys are bone tired and it shows in the interview. They’ve completed three benchmarks but are still optimizing two of them to get a better final score. The biggest thing they’ve learned is to never, ever, rename library files. Hard-won wisdom for the short-handed team. Team Nova Tech also recommends reading the installation files and readme files – good advice in any context. These guys could have given up at any time, but they didn’t; they drove on and really impressed both the judges and other competitors.

Witts Team One:  Witts University fielded two teams for this competition. This looks to be one of the better prepared teams, having put in lots of practice on a test cluster at their university. The team seemed pretty conventional in the interview until I got to Donald. Donald is in charge of compiling and optimizing the HPCC benchmark, which is an amalgamation of many benchmarks. He doesn’t see this as much of a challenge, which impressed me.

But what really impressed me about Donald was his confidence. When I asked the team how they felt about their chances to win, Donald responded “99.9%. I would have said 100% but nothing is ever for sure.” He also said, “we should start learning German now.” In my 10 years of Student Cluster Competition experience, I’ve never seen a player call his shot like Donald. In the student cluster world, he’s like Joe Namath, Muhammad Ali, Larry Bird and Michael Jordan all rolled into one. I love the whole team’s attitude and they’re obviously highly skilled.

Donald was particularly expressive in our follow-up interview. He complimented his teammates expansively and had some advice for the other team:  “pack up and go home.” Damn, I love this kid and his whole team! You gotta watch the video to see what I mean….

Witts Team A:  The second team from Witts looks to be solid as well. They were looking to containerize their applications but gave that up early on in order to get some solid results before optimizing to dial in their best possible numbers. When we meet the team, they’re down a member, but have compiled all of their benchmarks and are just starting the optimization process.

This is also a very confident team, like the other Witts team. Like Witts One, Witts A also guaranteed that they would be the winning team and make it to Germany. When we check back in on the final day, the team isn’t quite as confident. Overnight they had a node go down with a blown-up motherboard. This has definitely hurt team morale, but they’re hopeful that the scores they submitted previously might be enough to put them over the top. But all is not well with Witts A, despite their great attitudes. It’s just an unlucky blow that seems to happen every once in a while. Ouch.

Team Two Nodes, One Cup:  Edgy name for a fun team. A name that made me stop in my tracks and read it two or three times before believing it. They’re truly a delightful team, great sense of humor and highly skilled. The team has divided up their workload well and seem like they have a good grasp on the tasks.

But they might be a little outgunned when it comes to hardware. The team is sporting dual workstations, each with 48 Xeon Silver cores and 92 GB of RAM. Where they might be ahead of the game is in their choice of network cards: they have selected high-end network cards and might be driving double the bandwidth of other teams. We’ll see if that is enough as the competition unfolds. But this is a team that just won’t quit, despite running into some problems. Check them out in the video below…

Winners? All of Them

The winning team and the rest of the CHPC national team were announced at a gala closing banquet. Great food was served, entertainers entertained, and dignitaries delivered rousing speeches. But I, at least, was waiting impatiently for the awards for the Cyber, AI, and Student Cluster competitions to be handed out. (More details on the Cyber and AI competitions in upcoming stories.)

The Intel Award

Before the final student cluster team was named, there was some other business. Intel had very generously contributed a $5,000 scholarship for the most outstanding male and female competitors. I know that most of you probably haven’t been to South Africa, but let me tell you, injecting $5,000 into a college student’s life is a game changer for that student. Most of these kids are just getting by when it comes to finances and this award can make the difference between finishing college in four years vs. dropping out or taking much longer to complete their degree.

The Intel Award for this year went to our pal Donald on the male side and Sivenathi Madlokazi on the female side.

Finally, the moment was at hand. The winning team and the foundation of the CHPC national team was….wait for it…Witts Team One – the team with our pal Donald Alungile. Now it was time to name the two other team members and the alternates. I’m going to let the video do the talking now….

There’s a Dell in Their Future

We’d be remiss if we didn’t mention that Dell is supporting the entire South Africa CHPC Student Cluster Competition with equipment, technical support, and money. This cluster competition has been supported from the start by Dell and they do a fantastic job. But Dell isn’t stopping there.

The next step for the team is to travel to Austin, Texas, on a Dell sponsored trip to get additional training from both Dell and the Texas Advanced Computing Center (TACC). Dell engineers will advise and collaborate with the team to design their ISC20 cluster, making sure that the CHPC students have the finest hardware available in the industry.

Cluster Competition, Meet COVID-19

The COVID crisis has forced the ISC20 Student Cluster Competition to go to a virtual format this year. This means that every team will be using the exact same cluster, a two-node system located at the National Supercomputing Centre of Singapore. While this is certainly a disappointment for the CHPC team, not to mention Dell, there isn’t anything anyone can do about it and all of the teams are facing the same conditions. We’ll see if CHPC can adapt and overcome, as they’ve done in the past.
