HPCwire’s Managing Editor sits down with Jack Dongarra, Top500 co-founder and Distinguished Professor at the University of Tennessee, during SC21 in St. Louis to discuss the latest Top500 list, the outlook for global exascale computing and what exactly is going on in that Viking helmet photo. Plus what’s in store for 2022.
Transcript (lightly edited):
Tiffany Trader: Hello, I’m Tiffany Trader, Managing Editor, HPCwire. And here with me at SC21 in St. Louis is Jack Dongarra. Jack Dongarra needs no introduction. But he is the Top500 co-founder, he’s the originator of the Linpack benchmark, and he is also a distinguished professor with the University of Tennessee and has affiliations with Oak Ridge National Laboratory and the University of Manchester. So, Jack, we’re here at SC21. How many SCs does that make for you?
Jack Dongarra: Hello. I’m one of a select group of people who have been to all of the SCs. SC started in 1988. And there are 18 of us who have been to all of the conferences. This year is a little unusual in the sense that not all of us are here in person. Some of them are here virtually. But we’ve been to every one of them since the beginning in 1988.
Trader: And you all have the designation of perennials…
Dongarra: And we have the designation of perennials, that’s right. So that’s our claim to fame.
Trader: Speaking of Supercomputings from years past, there is an iconic photo of you from a little while ago in a Viking helmet, and I’ve always wondered what the backstory is on that.
Dongarra: So that was during the Supercomputing when it was held in Minneapolis, Minnesota [SC92]. It was in November, of course; and it was freezing cold. And there was a party that Intel threw, and part of the party was to give away those helmets. And I still have that helmet; it’s used by my grandchildren now, and they do enjoy it.
The Intel party was a little bit unusual in a number of ways. One way was they had hired the Viking cheerleaders, NFL Viking cheerleaders to come in and perform at the party, and that was probably not the most politically correct thing to have happen. And the other memorable thing that sticks in my mind is one of the parties, maybe the exhibitor party, was held at Prince’s club in Minneapolis. There was a lot of loud music, a lot of drinking, a lot of things going on. At about one o’clock, I decided to leave Prince’s club, and as I was leaving the entrance that we were using, somebody told me to step aside. And I was a little shocked that I was being asked that. So, a car pulls up and Prince jumps out and goes into his club.
Trader: I think a lot of people will be jealous of that story. That’s a pretty good story. Another important title that you’re wearing this year, and for this show, is that you are the keynote chair. And we had a really nice plenary on Monday night. It was the HPC, AI and ethics plenary chaired by Dan Reed. And then there was the keynote this morning with Vint Cerf, the father of the internet who is also the Internet Evangelist for Google. So, I mean, how did you pick those folks?
Dongarra: Well, this year, I was asked by Bronis [de Supinski, SC General Chair] to be part of the team that put together the conference. I’m on the executive committee, and my responsibility this year is to drive the keynote, to choose a keynote speaker. And part of that responsibility is also to put together this panel. That would be the plenary panel for SC. I’ve known Dan Reed for many years. He was the chair of that. He suggested a bunch of names. We wanted to make sure that the panel was in-person, we didn’t want to have virtual panelists. So it did take a little bit of struggle to put together. But we had a terrific panel. I think it’s the first time we had an astronaut on the panel. We had a lawyer, we had an MD, we had a scientist. So it made for a very rounded group of people to talk about ethics and the impact on supercomputing and what perhaps the future might look like. So I thought it was a very interesting panel, very engaging.
This morning, we had our keynote from Vint Cerf. I’ve known Vint for many years. And when asked to do it, he very graciously said he would. I had heard him speak before about a number of things, and suggested that maybe he can give a talk because of the theme of this conference relating to ‘science and beyond’ and humanities and so on. That he could give a talk that I had heard him give at the University of Tennessee, that was related to what he talked about today. I thought it was a great talk. He was able to get things at the right level, to capture the interest of the audience. From where I was sitting, it looked like the whole auditorium was full, which I was very impressed with. We were concerned, of course, that because of the pandemic it wouldn’t be, but I think Vint being a very well-known name, he was able to attract a very large crowd. And I think he did very well in this presentation today. I think a lot of people enjoyed it.
Trader: Yes, it was nicely attended, as was the Intersection of Ethics and HPC panel, which, as you referenced, had a really rich and distinguished set of panelists. And because of the hybrid nature of this event, those were livestreamed to all of the attendees. And they will also be made available on demand. So if you haven’t seen them yet, I really recommend people check them out.
Dongarra: Absolutely. So they’re roughly 45 minutes to an hour long. And they are online, as we say, and they are worth listening to, to capture the spirit of what was talked about.
Trader: So one of the flagship events of the SC and then the European counterpart, ISC, is the unveiling of the twice yearly Top500 list. And this one was no exception. We were in the Top500 press briefing yesterday on Monday with you and the other list authors and the Green500 list author. If you’d like we can dive a little into the list in a minute. And you can see our coverage on HPCwire for more of the feeds and speeds and the full coverage of that. But I thought, again, for this hybrid event, it was, I wouldn’t say surprisingly, but it was very rich and robust. And it was a good… I mean, I’ve been to many of these, and it was a good one.
Dongarra: It was a good one.
Trader: It was a little different… the format, we had questions coming in from the people who were livestreaming in. And so they had some interesting questions. And I asked some questions too. So these are some of the questions that came up. We were talking about the different systems and the different geopolitical considerations with China specifically holding back systems and, you know, that brings up the relevancy or the usefulness of the list, or what are the implications for the list if some big players are not on there.
Dongarra: That’s right. So yeah, let me just start by saying, this year’s list is not terribly exciting at the top end, anyway; there doesn’t seem to be many new machines at the top end. In the top 10, there’s one new entry in the list…
Trader: Number 10.
Dongarra: Number 10.
Trader: Fugaku is still on top.
Dongarra: Fugaku is still on top. Yeah. So that’s a very impressive machine. It’s the fourth time that it’s been number one on the list.
Trader: Cheers to Satoshi Matsuoka.
Dongarra: Satoshi Matsuoka deserves a lot of credit for that architecture, and that machine and how it’s run. And we were hoping that there would be new entries, which would be at the exascale point. We know that there’s a machine being put together at Oak Ridge. I’ve seen it; it’s there.
Trader: They stole one of the nodes and put it on the show floor. It’s in the AMD booth.
Dongarra: I didn’t know that… the machine isn’t quite ready yet. It’s a big machine. It’s a challenge of course, whenever you assemble a machine like that at scale, to bring it up for the first time. And I think they’re going through some of those early stages and trying to get this… trying not only to get the hardware up, but also to get the software working, and that presents some of the challenges that they’re facing. And that’s not unusual; that happens for all machines. The timing is such that they couldn’t get the entry in in time. I’m sure that they will have one in for June, for the June list. So we’ll see an exascale machine there.
China is reported to have at least two machines which are at exascale. And that comes from the rumors, of course, we haven’t seen the results. And that comes from good sources, though. So the question might be: why haven’t those machines been introduced into the list? Why haven’t they announced those machines? Why are they holding back? And you know there’s probably a number of reasons that we can give for that. One reason could be related to China not wanting to upset some kind of balance that’s in place now with the U.S. government. And I probably feel that that’s the reason that China’s a little concerned about coming up with technology that supersedes the technologies that are in the U.S. [Their] exascale machine would be probably five times faster than the machine at Oak Ridge today [i.e. Summit]. And that could cause some people to react in a way that might upset China. It’s unclear where China makes its parts, fabs its chips. They probably do it at TSMC, so Taiwan is a source for their chips. And the U.S. government may react in some way against that.
So that’s one argument I think that could be made for why they haven’t disclosed the machines. Those machines are not secret. There’s an entry in the Gordon Bell Prize, which was run on one of those machines. And that’s a very impressive result that they have. The architecture is described in some detail. So we got a good understanding of what that machine looks like. And it looks to be a very powerful system for solving high performance computing problems. You know, sometimes people think of machines that are put together and appear number one as being a stunt machine that was put together just to get a Linpack number. That was said about the machine from Wuxi, the Sunway machine when it first came out. But the reality is: it’s a very powerful machine, it is being used for science, they were able to win Gordon Bell Prizes with it. And I think that demonstrates that the machine is not just a one-off stunt to do number one in the Top500. So I feel that the Chinese machines when they are announced officially… So what does it mean to be announced? Well, in order to get into the Top500 list, they have to submit. They have to actively submit something to us, which shows the benchmark results and sort of proves that they achieved a certain level of performance. It’s simple to do, they click on a website, they upload a file, and then it becomes part of the Top500 list the next time that list is released. And they have not done that at this point. So we’re waiting for it, we’re hoping for it. Maybe in June, we’ll see three machines at exascale. That would really be a substantial change to the list, in the sense that the current number-one machine on the list is 440 petaflops. If we have three machines at exaflop, that would bump it way up. And so that would be an exciting time as we look at that, for high performance computing, showing that the progression is still there. Right now it looks rather as if it’s in the doldrums, it’s very, very quiet.
Trader: Yeah, the list has some natural steps in it. But that will be the largest step.
Dongarra: It will be a big jump.
Trader: A big jump… You offered a lot of really interesting information there. And one of the things that caught my attention was this pivot that we are seeing [from China] away from the Top500. But then they redirected those efforts into the Gordon Bell list. And that’s actually given some interesting research results, some nice research.
Dongarra: And I think that’s a great thing. We need many kinds of metrics to understand how these machines can be used and what benefit they can provide, and what levels we can expect for applications. So showing the Top500 is one. Showing the Gordon Bell Prize certainly exhibits the fact that they can solve real problems on a machine at scale, doing things which cannot be done on any other machine. You know, that really shows the importance of that scientific instrument. And, you know, I view these supercomputers as scientific instruments, much like the Hubble Telescope, or the Webb Telescope, you know, something that is unique. It provides tremendous insight for people that get to use it. It provides the opportunity to push back the frontiers of science when it’s used appropriately. And that’s really what supercomputing does.
Trader: Well said. And another thought I had: as you said, there’s potentially two exascale machines coming up on the list from the U.S., Oak Ridge and Argonne National Laboratories. And then we’ll see what China ends up doing. You know, that could be four. In the U.S., there’s kind of an interesting consideration. So Intel and Argonne, they announced a couple two, three weeks ago, that they’re basically essentially doubling the size of the forthcoming Aurora system. So I think what some folks are looking at now is what the timeline will be. And if there’s a potential even though, you know, Oak Ridge had the headstart with an installation on the floor, if somehow they could come in on the same list, which would be November 2022 [I meant June -tt]. With potentially Argonne coming in above.
Dongarra: I think it’d be great to see the Oak Ridge and the Argonne machine emerge next year, I think that would be good. You know, I have my question/doubts, whether or not the Argonne machine will make it. But, you know, I think that’s certainly in the realm of possibilities there. I’m pretty confident that the Oak Ridge machine will be on the list though. That machine is in place, running. And it’s just a question of tuning.
Trader: Yeah. And I mean, it’s just sort of speculation, but it’s possible that the Argonne would be at half size, and then they wouldn’t be to the full size. I don’t have any knowledge specifically of that. But it’s a reasonable thing that happens often where there’s sort of a phase one, and a phase two. That’s very common.
Dongarra: And then there’s another machine too, the Livermore machine, coming a little bit later.
Trader: Yes. Coming around the corner.
Dongarra: Yep. So three big machines.
Trader: That’s pretty exciting. Talking about geopolitic[s] and the new systems, there are four additional new systems that were interesting. There were four Russian systems as well that came into the latest list. And I think all four of them were within the top 50, mostly used for hyperscale and cloud. I don’t have the speeds and feeds at hand, but they were pretty significant, and had decent interconnects.
Dongarra: Decent interconnect. I think they’re Nvidia based systems, A100 accelerators being used. I think the official website says something about AI based applications, cloud things, of course, coming from that context. And yeah, I don’t have any more details about the exact placement. They appear to be commodity based systems in that sense. There doesn’t seem to be anything that was specially customized. I’m sure there’s a lot of software that has been customized. But I don’t think that there’s any customized hardware going into that.
Trader: Any other thoughts on the list or highlights from the show?
Dongarra: Well, I haven’t been able to walk around the show floor here. It’s a booming place, not quite as large as it has been in the past. But a lot of vendors are here. Some vendors are not. But there seems to be enough activity here that it resembles a regular show. I’m reminded that this conference this year is about the same size as the conference held in Germany, the ISC conference, maybe a little bit bigger now. I think the numbers are 3,500 people, I think I heard said this morning. And I think the German conference has maybe about 2,000 people.
Trader: In-person attendees?
Dongarra: Yeah. So 3,500 in person here in St. Louis. And it looks like it’s a valid, thriving show. There’s been some hiccups with some of the virtual nature of things, links not working correctly, but I’m sure we’ll sort it out. This is the second day of the conference.
Trader: And the other thing is SCinet was really good here. Raised floor. I thought that was pretty cool. That’s pretty appropriate and on brand.
Dongarra: Yeah, on brand. So of course, those guys do a remarkable job. They come in a week ahead of time, maybe two weeks ahead of time, and get everything set up, put in the network, put in all the wiring, put in everything that’s needed to make this event, the wireless stuff. And then they have to tear it down in a day. So it’s just incredible.
Trader: And all volunteer.
Dongarra: Yeah, all volunteer. It’s just a remarkable story.
Trader: Yeah. So you recently revealed some changes coming down the path in your career and some retirement plans. You want to tell us a little bit more about that?
Dongarra: I announced that I’m going to retire. I’ll retire in the summer. I’ve been at the University of Tennessee for 32 years. And I’ll be 72, I guess, at that point. So it’s time for me to step down and transition in some sense. So I’ll be an emeritus professor at the University of Tennessee. What does that mean? Well, emeritus means I have no teaching duties. I have no committee work. I can still have an office. I can still run a research group. I can still come in whenever I want. I think I get a free parking spot. I don’t get any pay, and I will be able to be co-PI on grants. So there’s a lot of things that will remain the same.
I have a group in Tennessee: the Innovative Computing Laboratory. My group has 45 people, and that’s composed of research professors, postdocs, graduate students, programmers, and some administrative staff; and that group of 45 people is on soft money, as we call it. So soft money means we get grants, and that’s how they get paid, and if we don’t get a grant, there may not be money to pay people. So we strive very hard. We have a pretty good system for putting in place grants, and we have a good success rate as well, not 100 percent, but enough to sustain us in our activities. My hope is that when I leave, officially step down as distinguished professor, we’ll be able to hire someone to come in and fill the position. The MathWorks Corporation has given us a donation. So we can have a named professor, the MathWorks professor in scientific computing, and I’m hoping that individual will come and help run the activities that have been put in place over the last 32 years.
So I’m looking forward to retirement and stepping down from some of the jobs that I don’t like to do, and maybe continuing on with the jobs I enjoy doing that are what I consider to be my hobbies. One of my hobbies is benchmarking. So I’m not going to abandon ship. In terms of the benchmarks, we have the Top500, we have the HPCG benchmark, we have the HPL-AI benchmark, and there are other things too that we’re dabbling with. So there’s a number of things that will continue. Given more free time, I’ll probably be able to help out those things a little bit more.
Trader: Even more, yeah. So you’ll be maintaining your benchmarking responsibilities and activities, maybe putting even a little bit more focus into those. And I can probably expect to see you next year as part of the press briefing for the Top500. That show is in Dallas; SC22 is in Dallas. And will you be there?
Dongarra: Hey, I’m a perennial.
Trader: There you go. I look forward to next year. Thanks for joining us.