Making the Team South Africa: Defending the Crown

By Dan Olds

June 15, 2020

As you read this article, 82 university students from 11 countries are working feverishly on a cluster located at the National Supercomputing Centre of Singapore to try to win the ISC 2020 Student Cluster Competition golden crown. Ok, there isn’t an actual golden crown, but there are trophies, including a big one for the Overall Champion.

One of these teams is from the Centre for High Performance Computing located in South Africa. This is their eighth appearance in the ISC cluster wars and they’ve built up an incredible record of four gold medals, two silver medals and a bronze. In other words, they have made the podium every single time they’ve competed.

This achievement is all the more impressive because each of their teams is a unique set of undergrads – no repeats allowed. Some teams have the same students appearing in every competition until they lose their eligibility and go pro. Not the case with South Africa; it’s one and done for them. Former team members mentor new members but can’t compete more than once in the big dance.

Little Dance Then Big Dance

The CHPC is the only organization that has a ‘play in’ round to select its ISC team. Early in the competition year, the word goes out to universities all over South Africa: Put together your cluster teams. It’s go time.

The organization provides training materials and classes to help prepare the HPC beginners to compete at the CHPC HPC forum that occurs every December. At the forum, ten student cluster teams from various universities gather to duke it out to see who will be selected for the national team.

I had the privilege of attending the 2019 CHPC cluster competition and covering the three student competitions that took place: the cluster competition, the cyber-security competition, and the AI competition. In this article, I’m going to take you through the cluster competition in detail.

Ten Teams – One Winner

Each team is composed of four undergraduate students. They are assisted by mentors from past CHPC cluster competition teams, which is very cool. The overall winning team will form the foundation of the national team, joined by two outstanding competitors from the non-winning teams, plus two alternates.

Through the miracle of video and extra airline luggage fees to haul the equipment to Johannesburg, South Africa, I was able to interview each of the teams twice, once to meet them, then again towards the end of the competition as a check in. Let’s take a look…

Team Alt F4: Named after the shut down command, this team is looking to shut down the other competitors. When we first check in on the team on the second day, they’re doing well but are already tired. This is one of those teams where everyone does everything without a lot of specialization.

When we check back in on the team, it’s a bit of a different story. When I ask how they’re doing, the mood is definitely different – they’re in crunch time. They’ve been having problems compiling some of the applications, which is typical for these competitions.

Team It’s Spelt Bolognese: This team has one of the more unusual names in the competition, a real head scratcher for me. So that’s, of course, my first question for them. Explanation? Watch the video to see.

The team is driving a three-node cluster with a switch that is supposedly on the way but hasn’t arrived yet. (As it turns out, none of the teams get their switches in time, so they all go with point-to-point interconnects – old school, love it.) The whole team is from Cape Town, so Johannesburg is, according to them, a real treat. When we check in with the team on the last day, they’re struggling to get some results to submit. Like some of the other teams, it’s the compilers that are the issue – trying to find the right compiler for each app. This is, as we’ve seen, a common story and one that we’ll hear again.

Team Ketamine:  Ketamine is a horse tranquilizer which kind of goes with the motif of their booth. It’s a tranquil place with mood lighting and a laid-back style. When we catch up to the team early on, their three-node cluster is working well and the team is working on getting their benchmarks compiled.

According to the team, it’s “vibe first, Germany second” meaning that their mood is more important than winning and getting the coveted trip to Frankfurt for the ISC finals. They have a ‘different concept’ about what winning should mean in this context. To them, having a great time with their friends while at the CHPC conference is the ultimate win. We get into a bit of a dispute about how well this attitude will serve them in the big picture. I can’t tell if they’re just yanking me or being serious, although the team says they are serious. Check out the video and see what I mean.

Team Send Nodes:  Send Nodes is learning the fine art of building switchless interconnects as we catch up with them on the first day. They’re soldiering through and getting the hang of it. The team is running what seems to be the standard three-node configuration with each node being a compute node – no need to have a dedicated head node in clusters this small, right?

The team has appointed a “Compiler Tsar” who is responsible for finding and selecting just the right compiler for the job – sort of like an HPC sous chef. When we interview the team on the last day, we find them busily putting the finishing touches on their applications and trying to get the best results possible. They’re still getting plenty of error messages, some of them unique to their team, which is a bit troubling. While they’ve gotten to the point where they get to use the NVIDIA V100 GPU nodes in the cloud, they’re having trouble getting Quantum ESPRESSO to compile so that they can run it on the cloudy infrastructure.

Team Vision 404: Another interesting name. Combining “file not found” with “vision” could be interpreted as a bad thing. The team sees it as hopeful, although I’m not sure why. Team 404 hasn’t really divided up their work to a great degree, but on further questioning, it seems like one guy is responsible for most of the applications/benchmarks. The team also has a ‘Designated Google Guy’, a surfer dude who does all of the team research and provides answers back to the other students. Good division of labor.

On the last day of the competition, Vision 404 is fired up. They’re tired, sure, but they know this is the time to drive hard. Our comment: “don’t hate the player, hate the game.” At this point they’re resigned to competing against themselves and for posterity. Great attitude, love their passion and drive to learn.

Team SomberSystem: Kind of a sad name that was picked out of the blue by the team. They’re not all that somber, which is a good thing. Their system is three huge workstations connected by a point-to-point interconnect through their head node. On the first day, they’re having some problems getting their cluster to scale. It sounds like an MPI problem; they can run on a single node, but can’t get the app to scale and use memory on other nodes. I have some inane potential solutions for them, which are discarded instantly.

They have a team morale officer who tells jokes to keep the team loose and having fun. This is always a good thing as student clustering is tense business.

Team Nova Tech: Imagine my shock when I approached the team and found that they only had two members instead of four. This cluster competition puts a huge workload on a four-person team; it’s doubly huge for two (that’s just simple math, right?). This is the only team that has more nodes, at three, than team members. We’ll see how they hold up as the competition goes forward.

On our last day update, Team Nova Tech is still fighting. These guys are bone tired and it shows in the interview. They’ve completed three benchmarks but are still optimizing two of them to get a better final score. The biggest thing they’ve learned is to never, ever, rename library files. Hard-won wisdom for the short-handed team. Team Nova Tech also recommends reading the installation files and readme files – good advice in any context. These guys could have given up at any time, but they didn’t; they drove on and really impressed both the judges and other competitors.

Witts Team One:  Witts University fielded two teams for this competition. This looks to be one of the better prepared teams, having put in lots of practice on a test cluster at their university. The team seemed pretty conventional in the interview until I got to Donald. Donald is in charge of compiling and optimizing the HPCC benchmark, which is an amalgamation of many benchmarks. He doesn’t see this as much of a challenge, which impressed me.

But what really impressed me about Donald was his confidence. When I asked the team how they felt about their chances to win, Donald responded “99.9%. I would have said 100% but nothing is ever for sure.” He also said, “we should start learning German now.” In my 10 years of Student Cluster Competition experience, I’ve never seen a player call his shot like Donald. In the student cluster world, he’s like Joe Namath, Muhammad Ali, Larry Bird and Michael Jordan all rolled into one. I love the whole team’s attitude and they’re obviously highly skilled.

Donald was particularly expressive in our follow-up interview. He complimented his teammates expansively and had some advice for the other team:  “pack up and go home.” Damn, I love this kid and his whole team! You gotta watch the video to see what I mean….

Witts Team A:  The second team from Witts looks to be solid as well. They were looking to containerize their applications but gave that up early on in order to get some solid results before optimizing to dial in their best possible numbers. When we meet the team, they’re down a member, but have compiled all of their benchmarks and were just starting the optimization process.

This is also a very confident team, like the other Witts team. Like Witts One, Witts A also guaranteed that they would be the winning team and make it to Germany. When we check back in on the final day, the team isn’t quite as confident. Overnight they had a node go down with a blown-up motherboard. This has definitely hurt team morale, but they’re hopeful that the scores they submitted previously might be enough to put them over the top. But all is not well with Witts A, despite their great attitudes. It’s just an unlucky blow that seems to happen every once in a while. Ouch.

Team Two Nodes, One Cup: Edgy name for a fun team. A name that made me stop in my tracks and read it two or three times before believing it. They’re truly a delightful team, great sense of humor and highly skilled. The team has divided up their workload well and seems to have a good grasp on the tasks.

But they might be a little outgunned when it comes to hardware. The team is sporting dual workstations, each with 48 Xeon Silver cores and 92 GB of RAM. Where they might be ahead of the game is in their choice of network cards: they have selected high-end cards and might be driving double the bandwidth of other teams. We’ll see if that is enough as the competition unfolds. But this is a team that just won’t quit, despite running into some problems. Check them out in the video below…

Winners? All of Them

The winning team and the rest of the CHPC national team were announced at a gala closing banquet. Great food was served, entertainers entertained, and dignitaries delivered rousing speeches. But, for me at least, I was waiting impatiently for the awards for the Cyber, AI, and Student Cluster competitions to be handed out. (More details on the Cyber and AI competitions in upcoming stories.)

The Intel Award

Before the final student cluster team was named, there was some other business. Intel had very generously contributed a $5,000 scholarship for the most outstanding male and female competitors. I know that most of you probably haven’t been to South Africa, but let me tell you, injecting $5,000 into a college student’s life is a game changer for that student. Most of these kids are just getting by when it comes to finances and this award can make the difference between finishing college in four years vs. dropping out or taking much longer to complete their degree.

This year’s Intel Award went to our pal Donald on the male side and Sivenathi Madlokazi on the female side.

Finally, the moment was at hand. The winning team and the foundation of the CHPC national team was… wait for it… Witts Team One – the team with our pal Donald Alungile. Now it was time to name the two other team members and the alternates. I’m going to let the video do the talking now…

There’s a Dell in Their Future

We’d be remiss if we didn’t mention that Dell is supporting the entire South Africa CHPC Student Cluster Competition with equipment, technical support, and money. This cluster competition has been supported from the start by Dell and they do a fantastic job. But Dell isn’t stopping there.

The next step for the team is to travel to Austin, Texas, on a Dell-sponsored trip to get additional training from both Dell and the Texas Advanced Computing Center (TACC). Dell engineers will advise and collaborate with the team to design their ISC20 cluster, making sure that the CHPC students have the finest hardware available in the industry.

Cluster Competition, Meet COVID-19

The COVID crisis has forced the ISC20 Student Cluster Competition to go to a virtual format this year. This means that every team will be using the exact same cluster, a two-node system located in the National Supercomputing Centre of Singapore. While this is certainly a disappointment for the CHPC team, not to mention Dell, there isn’t anything anyone can do about it and all of the teams are facing the same conditions. We’ll see if CHPC can adapt and overcome, as they’ve done in the past.
