ASC24: The Battle, The Apps, and The Competitors

By Dan Olds

June 5, 2024

The ASC24 (Asia Supercomputer Community) Student Cluster Competition was one for the ages. More than 350 university teams worked for months in the preliminary competition to earn one of the 25 final competition slots. The winning teams then assembled (some within walking distance, others flying thousands of miles) in Shanghai, at Shanghai University, the host school for the finals.

The competition was held in a large Shanghai University gymnasium/stadium, with the scoreboards lit up with team names and how much juice their systems were pulling.

At 25 teams, this is the biggest cluster competition in the world and is certainly stadium worthy – plus it keeps the heat controlled. But it doesn’t do much to help the noise when you’re on the show floor.

As in every other ASC competition, there is a strict 3,000 watt power cap. Go over the cap and you’ll hear a loud siren and see your team name on the scoreboard turn red. And, to answer your next question, yes, you hear the siren a LOT, as you’ll see in the video interviews below.

One great thing about this competition is how well the organizers take care of the students. They pick up the tab for hotels, meals, and local transportation. They also provide systems for each of the teams. Students are given a menu of components to select from, so they can customize the cluster to their needs. One exception is accelerators – students can (and should) bring their own GPUs but have to make sure they comply with current regulations.

The Tasks

This is hands down the most difficult student cluster competition in the world from an application standpoint. The applications they select range from cutting edge machine learning and LLMs to older apps that twist student minds by making them get up close and personal with things like FORTRAN.

The app list for 2024 carries on that tradition with the following benchmarks, real-world applications, and tasks:

  • Benchmarks: The usual HPL (LINPACK) and HPCG to get the students and systems off the starting blocks.
  • LLM Inference: The teams will be charged with optimizing inference operations on the open source AquilaChat2 LLM, the 34 billion parameter flavor. What do they use, how do they do it? It’s wide open. Students need to maintain accuracy while reducing runtime to win on this one. We’ll be posting an interview with Dr. Yong Hua Lin, Vice President and Chief Engineer at the Beijing Academy of Artificial Intelligence, where she explains what determines how easy or hard it will be to process inference traffic on an LLM.
  • OpenCAEPoro: this one is all about leaking. Anything from fluids to gases: if it leaks from somewhere, you can simulate it with OpenCAEPoro. This turned out to be the most difficult application for the students. Competition rules didn’t allow them to make wholesale changes to the code, and it turns out that the places where they could optimize were scattered everywhere. To make it even tougher, the optimization benefits were relatively small.
  • GoMars: If you’re looking to land a probe on the planet Mars, you’re going to want to know the weather, right? Landing in the middle of a furious dust storm could spell disaster for your carefully crafted vehicle. Why gamble on sunny skies when you can use GoMars to generate Mars weather predictions and simulations? Students are charged with optimizing this app, and the team with the shortest runtime on the provided dataset, while maintaining accuracy, takes home the prize.
  • Mystery Application: This year, it’s WannierTools, which are associated with Wannier Functions that are used in solid-state physics. It’s an open-source package that examines the physical properties of tight-binding models. Deep science stuff indeed.
  • Group Application: this is where a representative from each team is joined with others to create five randomly selected groups and then tasked with optimizing the ParaSeis seismic wave propagation simulation package.
  • Final Presentation: each team will have to give a presentation in front of a panel of HPC experts who will judge them on various criteria. Judges question students along the way to see if they understand the applications and why they made particular choices when trying to tune them.

Meet the ASC24 Teams

Beihang University:  This is the first competition for these particular students, but Beihang has participated in several ASC competitions in the past. They have an interesting strategy for OpenCAEPoro, which everyone seems to agree is the most difficult application. Team Beihang has decided to put a higher priority on optimizing everything else first and then working on OpenCAEPoro. They believe that putting more effort into the other tasks will result in a better overall score. Are they correct? We shall see…

 

The Chinese University of Hong Kong:  One of the first people we talked to in the interview introduced himself as the team’s system administrator, which prompted me to take a look at their cable management. They did a reasonably nice job, not as good as FAU, but it’s ok.

In the interview, we discuss power management and it looks like the team has a good handle on it, which is good since this is probably the most critical part of the competition. The team is using 8 Nvidia L20 GPUs but is having some problems getting all of the applications to see and use them at this point in the competition.

 

National University of Cordoba:  A first-time team all the way from Cordoba, Argentina, which is at best a 24 hour long flight to Shanghai. The team is taking sort of an “all hands on deck” approach with everyone doing a bit of something to help out. The team is a bit behind the 8-ball due to a physical problem with their GPUs not fitting into the provided nodes.

They did manage to shoe-horn in a couple, which is better than nothing. The team also hasn’t had a lot of hands-on work with GPUs, which is also a bit of a problem. But they do have some cluster competition experience, having finished third (out of 15 teams) at the SC23 Indy virtual competition, which isn’t too shabby.

(Additional note:  the travel arrangements for Team Cordoba weren’t great. They were arriving a day before the competition started and leaving the night it ended, giving them no chance to see Shanghai and China outside of the Shanghai University campus. When the organizers heard about this, they helped the team to change their departure to Thursday, giving them four days to experience China. Great job, ASC organizers!)

 

Friedrich-Alexander University, Germany:  We caught up with Team FAU on the first day of the competition, during the testing period. The team will be running maybe as many as six nodes, but probably five, with two nodes loaded with V100 GPUs and one with two L20s. There are some troubles this Monday morning, however.

They’re having problems getting the benchmarks running, even though they worked fine on the creaky old system they used at home. It took six hours to download the container with their HPL/HPCG instances. Yikes. That’s why I advise bringing physical copies of everything – load up some big thumb drives. They’ll be ok, I think.

After I turned off the camera, they showed me their cable management. We should have led with that. They did a magnificent job, as you’ll see from the video. Really, it was magnificent.

 

Fuzhou University:  This university has participated in several ASC competitions over the years and it’s great to see them at ASC24. This is one of the first times we’ve used our translator April for a team interview and it’s a little bit rough at first to get the students to speak in short sentences so that she can relay what they’re saying. But she deals with it in good humor so it’s all good.

 

Harbin Institute of Technology:  We get team introductions courtesy of our new translator Charlotte. For some reason, I joke that I’m about to fire her, but I can’t remember why. She’s laughing, though, so I’m not as big a jerk as I sound. This is the second competition for Harbin and comes on the heels of their first outing at ASC23.

The team seems to be doing ok at this point but is running into network problems with their cluster due to some network changes/constraint changes. Otherwise, it seems like smooth sailing.

 

Hong Kong Polytechnic University:  It is incredibly loud in Cluster Stadium. Even though the teams are spread out around the edges of the court, the screaming and yowling of the clusters is too loud for comfortable conversation.

Team Hong Kong Poly seems to be unaffected by the noise as they prepare for the start of the competition. In the interview, we meet the team and learn what they’re working on. Their hardware seems to be working well and they were pleasantly surprised to receive Nvidia L20 GPUs from their Hong Kong sponsor, which will certainly come in handy.

The team seems to agree that OpenCAEPoro is the most difficult application due to how hard it is for them to figure out exactly how the code works and where it can be optimized. We’ve heard from other teams that even when you find places to streamline the code, the overall performance benefit is on the minimal side. On the other hand, they are finding that there are a lot of ways to optimize the LLM inference task, which is good to hear.

 

Huazhong University of Science & Technology:  This school has sent many teams to ASC competitions over the years and even hosted the ASC event back in 2016. Huazhong turned their home team advantage into a big win and went on to compete at ISC16 later that year. Like other teams, they also see OpenCAEPoro as the most challenging task in the competition, but are dealing with it well, it seems. I had some fun interplay with the team in terms of goal setting (they had achieved 2x on one application, I convinced them to go for 10x).

On a sad note, I examined their cable management and had to counsel them to do better and advised them to check out FAU. But, regardless of the cables, the system is working well. It’s a pretty good system with six nodes and up to 12 GPUs available to use. They don’t exactly know their final configuration yet, they’re still testing the apps and evaluating the power draw.

I figure they’ll probably end up using three nodes and six GPUs, but they do me one better by proposing three nodes and nine GPUs – which should be pretty sporty. We also have some fun with one of the team members who presented himself as the “detail guy” for the team. We had a lot of laughs in this one.

 

Jinan University:  This is the seventh cluster competition appearance for Jinan University, although it’s the first for these students. Team Jinan took home the ASC21 championship and won bronze at the ISC21 competition. The team has two students covering OpenCAEPoro, which is nearly universally acknowledged as the most difficult application in the ASC24 line up.

We did a cable management check and they did a good job, not FAU quality, but decently close.

The team will be running three nodes with six Nvidia A800 GPUs and will need to monitor the power closely. At full out, their system could pull 4kW, which is way above the 3kW limit, of course. The team points out that the CPUs used this year in the organizer provided clusters take significantly more power than in previous years, making their monitoring/throttling job even more difficult.

 

Kasetsart University:  This is the sixth competition for a team from Bangkok, Thailand based Kasetsart University (four ASC outings and one ISC event). The team will be running four nodes with four Nvidia 3090 consumer GPUs that they brought from home. I’m not sure how well they’ll perform on the HPC/AI tasks; they’re much better at graphics and gaming, as we discuss in the interview.

We go into a little detail about the challenge posed by the LLM inference optimization problem, but keep in mind it’s early in the competition and they’re still testing their system and applications.

 

Lanzhou University:  I start with my typical mispronunciation of the university name but we quickly get past it. This is the third consecutive ASC competition for the school located in north-central China. The team is running three nodes with six Nvidia A100 GPUs. Depending on the exact model, these will draw between 250W and 400W each without throttling, meaning a total possible GPU load of 1,500W to 2,400W, which isn’t too hard to handle if they keep their eye on it. The team has dedicated a member to power management, which should help out a lot.
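That range is easy to check by hand, but a quick sketch makes the remaining headroom explicit. The GPU count, per-card wattages, and the 3,000 watt cap come from the text; the function name is made up for illustration, and the calculation deliberately ignores CPUs, memory, fans, and networking, all of which also count against the cap.

```python
POWER_CAP_W = 3000  # competition power cap cited in the article


def gpu_load_range(num_gpus, min_w, max_w):
    """Return (low, high) total draw in watts for unthrottled GPUs."""
    return num_gpus * min_w, num_gpus * max_w


# Six A100s at 250 W to 400 W each, per the article.
low, high = gpu_load_range(num_gpus=6, min_w=250, max_w=400)
print(f"GPU load: {low} W to {high} W")
print(f"Left for everything else: {POWER_CAP_W - high} W to {POWER_CAP_W - low} W")
```

In other words, with six cards running flat out, somewhere between 600 and 1,500 watts remains for the rest of the cluster, which is why a dedicated power watcher is a sensible role.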

I manage to bore the entire team senseless as I tell them the story of the Team Zhejiang “Suicide LINPACK” from ASC16. That was sort of fun for me, actually, watching their eyes glaze over.

The team sees the traditional Mystery Application as a big challenge because they will only have a limited time to build, run, and optimize it. One member also points out that the LLM inference challenge is pretty tough due to the number of models they have to run to figure out the best solution. But after overcoming some early problems with cluster management, the team looks to be in good shape.

 

Macau University of Science & Technology:  I like all the student cluster competition teams I interview and interact with, but this one I really like. It’s a brand new team from the island of Macau – first time doing any of this stuff. Plus, they’re short-handed with only three members. If that weren’t enough, Team Macau is running six nodes but only has two Nvidia L20 GPUs, which puts them well behind the other teams. If this isn’t a great underdog story, I don’t know what is.

They’re here because they want to learn HPC/AI and they want to bring back their experience to Macau and teach a team for next year. Did the little team from Macau give us a Hollywood ending by winning the championship against all odds?  Spoiler alert:  no, they didn’t.

But they learned a lot by participating in the toughest cluster competition in the world and they’re going to come back next year. That’s a big win in my book.

Welcome to the world of bigtime HPC/AI, Team Macau, it’s great having you here.

 

National Tsing Hua University:  The school has participated in a LOT of cluster competitions – 23 ASC, ISC, and SC events, to be precise. They’ve racked up a lot of awards along the way, including a championship at ASC19. Win or lose, they always send a solid team.

At ASC24, they’re driving a four node cluster with each carrying dual Nvidia L40 GPUs. While Team NTHU is used to using Nvidia A100s, they point out that the L40s are a pretty good substitute for them, which is true, from what I hear.

 

Peking University:  Hailing from Beijing, Peking University has become a formidable student cluster competition team over the past several years. But the team I interviewed on the first day of the ASC24 competition was facing problems. Team Peking is running four nodes with dual Nvidia A100s on each. That’s quite a bit of power to fit under the 3kW power cap, but they’ve proven to be pretty good at this. It’s a confident bunch at this point in the competition and they seem to have a good handle on the applications.

Back to the problems, it seems like changes they’ve made to their software packages have resulted in the system not being able to boot. Their competition cluster is running different CPUs than their practice cluster back at home, so it looks like a kernel swap is in their future. They have plenty of time, so should be ok.

Like other teams, they see OpenCAEPoro as the most difficult application at ASC24, explaining that there really isn’t that much of the code that can be modified under competition rules. Compounding this is that the parts of the app that compute results are scattered all over the code, which makes them hard to ferret out and optimize. Team Peking took home the ASC23 championship trophy. Can they do it again in 2024?

 

Qilu University of Technology:  This is the second time that Qilu University has sent a team to the ASC cluster competition. The first question in our interview seemed to bring up a problem:  the LLM inference application wasn’t running yet. But they have time to fix it and they do have OpenCAEPoro and the group problem running, so that’s pretty good.

I drop a Yoda-ism on one of the students with a timely “don’t try, do” tossed into the mix. I also made fun of their cable management skills, which is always fun for me. Team Qilu agrees with most of the others with their belief that OpenCAEPoro is the most difficult app in the competition. They’re still early in the process but they have profiled it and that’s a good start.

 

Qinghai University:  Making their fourth appearance at ASC is Team Qinghai. As we catch up to the team, they’re running and seem fine hardware wise. And although they see OpenCAEPoro as the toughest application, they’ve worked with it and have some confidence in their skills. We talk some GoMars and mainly agree that the atmosphere on Mars probably isn’t so great.

We had some help from our translator Charlotte on that as my rapid fire English was rough for the GoMars specialist… I need to slow down more, but my caffeine level at this point in the interview schedule is way off the charts.

 

Shanghai Jiao Tong University:  This school is a very familiar presence at cluster competitions world-wide. They’ve fielded teams in 12 different cluster competitions since winning the ASC14 championship in their debut. The teams from Shanghai Jiao Tong have nearly always finished in the top echelon but haven’t nabbed another championship since 2014.

In the interview, we discuss how critical power control is in this competition. Every application has a power profile, according to the team, and it’s vital to correctly balance CPU, GPU, and even DRAM power draws to stay under the limit but get maximum performance.
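To make that balancing act concrete, here is a rough sketch of the kind of budget split the team describes. Only the 3,000 watt cap comes from the competition; every other number (CPU, DRAM, and overhead draw, GPU count, safety margin) and the function name are hypothetical placeholders, not figures from any team.

```python
def per_gpu_limit(cap_w, cpu_w, dram_w, other_w, num_gpus, margin_w=0):
    """Split what remains of the power cap evenly across the GPUs.

    Raises ValueError if the non-GPU components alone exceed the cap.
    """
    remaining = cap_w - margin_w - (cpu_w + dram_w + other_w)
    if remaining < 0:
        raise ValueError("non-GPU draw already exceeds the power cap")
    return remaining / num_gpus


# Hypothetical cluster: all wattages below are illustrative guesses.
limit = per_gpu_limit(cap_w=3000, cpu_w=1000, dram_w=150, other_w=250,
                      num_gpus=6, margin_w=100)
print(f"Each GPU may draw up to {limit:.0f} W")
```

A real team would tune these numbers per application, since a CPU-heavy code and a GPU-heavy code want very different splits of the same 3,000 watts.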

When we catch up with the team, they’re still rooting out some problems with their cluster file system, but they seem confident that they’ll nail them down before the competition formally starts. On the GoMars front, they understand how the application works just fine; it’s the optimizing that’s the hard part for them. On the hardware side, they’re going to be running a four or five node cluster with as many as six Nvidia A800 GPUs. The final configuration depends on the results from their testing.

 

Shanghai University:  This is the home team for ASC24 and they’re participating in only their second ASC competition. In the team interview, we meet the students and they display their knowledge of programming languages and other techie stuff.

When asked about the most difficult part of the competition, they discuss the difficulty they had in obtaining the right hardware – meaning GPU accelerators. It all worked out in the end and the team now has eight Nvidia A800 GPUs under the hood, two A800s in each of their four nodes.

They’ve written scripts to control the power, definitely a good idea, but I wonder if they have the skills (yet) to control power consumption to the degree that other more experienced teams have learned? Time will tell…

 

Shanxi University:  This is the fourth ASC cluster competition for Shanxi University. At the time of the interview, we find that the student in charge of the LLM inference task is a new substitute on the team. I get her to commit to a 20x speed up, which is very ambitious, right? One of the team members is working on GoMars, but is also in charge of GPU optimization, which is the first time I’ve seen a team have a dedicated GPU specialist. The team has up to eight Nvidia A800 GPUs to use with their four nodes.

At this point, the team seems to be in good shape and have all of the apps up and running. When asked about what they’ve learned, one of the team members grabs the mic and says “I have learned a lot about FORTRAN and I have learned to never touch FORTRAN again…” or words to that effect. Gotta feel for the guy, right?

 

Sun Yat-Sen University:  This school has sent 13 teams to cluster competitions since their debut at ASC14 on their home field in Guangzhou, China. In the interview, we learn that the team is sporting eight Nvidia A800 GPUs, which are pretty good on power.

This is another team that has a dedicated hardware power control person, which is a good move, I think. As he put it “I am trying to squeeze every single Gflop out of the 3,000 watts.” We talk a little bit about the art and science of power throttling. We also talk some GoMars and find that it’s memory bandwidth bound and also an I/O hog.

The team also discusses why OpenCAEPoro is such a hard application to understand and optimize. They are also concerned about the number of cases they will be presented with, which prompts me to relate a cluster competition war story from 2010 or so, where NTHU cherry picked the data sets rather than just doing them sequentially. It was a winning strategy.
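The war story doesn’t spell out how NTHU chose its data sets, so the following is only one plausible way to cherry-pick: a greedy pass that favors the cases with the best points-per-minute ratio until the time budget runs out. The case names, points, runtimes, and function name are all invented for illustration.

```python
def pick_cases(cases, time_budget_min):
    """Greedily pick cases by points per minute until time runs out.

    cases: list of (name, points, est_minutes) tuples.
    Returns the chosen case names in the order they would be run.
    """
    ranked = sorted(cases, key=lambda c: c[1] / c[2], reverse=True)
    chosen, used = [], 0
    for name, points, minutes in ranked:
        if used + minutes <= time_budget_min:
            chosen.append(name)
            used += minutes
    return chosen


# Invented workload: (name, points, estimated minutes)
cases = [("A", 10, 90), ("B", 10, 20), ("C", 10, 30), ("D", 10, 40)]
print(pick_cases(cases, time_budget_min=100))
```

With these made-up numbers, running the cases in order would finish only case A inside 100 minutes, while the greedy pick completes B, C, and D. A greedy ratio is a heuristic rather than an optimal schedule, but it captures the idea of not just doing the cases sequentially.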

 

Southern University of Science & Technology:  ASC24 marks the seventh competition for SUSTech, which is the cool way to refer to them. The university, founded in 2011, is one of the newest major universities in China. Nestled in perhaps the loudest area of Cluster Arena, it was very hard to converse – even with near shouting. Team SUSTech is currently testing their cluster to decide on the best configuration for the competition. They figure they’ll use 4-5 nodes with six GPUs located on two nodes.

SUSTech also has a dedicated power throttling specialist who will be carrying the ball on HPL and HPCG as well. We kick around a bit of power strategy and he has a good handle on it.

When it comes to OpenCAEPoro, they’re stuck running on only a single node – which seems sort of a common problem early on. They were able to scale it at their home lab, but it’s just not happening yet on their competition box. The team has already faced some problems with the change in network rules, but powered through them by the time of this interview.

 

Southwest Petroleum University:  Hailing from the Sichuan province area of China, Southwest Petroleum University is a first-time cluster competition competitor. It’s a big school, with more than 32,000 students who are fueled by a rousing slogan:  “Righteous deed, persistent quest, extensive learning and creative mind.”

Despite the name, the school is about much more than just oil (and gas). They offer a wide variety of majors, including Computer Science and a school of Electronics & Information Engineering. With the help of our translator April (who did a great job), we meet the team and manage a bit of chat.

The student in charge of GoMars thinks the application is complex but that there is a solid chance of making it run faster. In talking to the LLM specialist, I lay out a challenge for him to increase performance 3x, which he accepts in good humor. On the hardware side, the team is running four nodes with maybe as many as eight Nvidia L20 GPUs, but won’t know for sure until they’ve done more testing.

 

University of Science & Technology of China:  For those of you keeping track, the ASC23 cluster competition was hosted by USTC at their home base in Hefei, China. In the interview with the team, we find that two people are dedicated to GoMars, with one devoted to the LLM inference optimization task. That seems a little unusual to me, but what do I know, right?

Team USTC is running four nodes each equipped with dual Nvidia A800 GPUs. They’ll need to do some throttling for sure, as running full out will definitely cross the 3kW power cap. The hardware is working and they’re able to run the apps, so all good on that front. Can Team USTC follow up their ASC23 Silver medal and ePrize at ASC24? We’ll see what happens.

 

Zhejiang University:  Last in the (English) alphabet, but they haven’t finished last in a cluster competition. Ever. This will be their sixth ASC cluster competition (they’ve also competed at ISC in Germany twice). One thing about Team Zhejiang is that they love their LINPACK. They’ve taken home the Highest LINPACK trophy twice in ASC competitions and also were the team that ran the infamous “Suicide LINPACK” at ASC16. (I tell that story in the Lanzhou interview.)

We learn how the team is dealing with OpenCAEPoro, mainly by realizing that there isn’t a lot they are allowed to tune, which reduces potential optimization benefits significantly. The student working on the LLM inference challenge has had to ramp up his skills over the past few months as this is all new to him.

The team is trying to optimize GoMars for GPUs but finding it a rocky road. In their cluster, they initially tried to use 10 Nvidia A100 GPUs but, no way they could do that and stay under the power cap. Their configuration includes a node that can handle up to 10 GPUs, which gives them lots of flexibility. The noise in the area is such that I have to get into the camera frame in order to hear them. Very unprofessional of me.

 

Ok, we’ve learned the rules, discussed the applications, and had a meet and greet with the teams. Phew, that was a lot of work and words, but well worth it.

Next up we’re going to the experts. We’ll be talking to Dr. Yong Hua Lin, Vice President and Chief Engineer of the Beijing Academy of Artificial Intelligence, to get a deep dive on inference and how to optimize it. Then we’ll have a conversation with two of the biggest names in HPC, Dr. Jack Dongarra and Dr. Torsten Hoefler, who were also judges at ASC24. Stay tuned…
