Designed to push the frontiers of chip and system performance for AI workloads, an 8-petaflops (Linpack) IBM Power9-based supercomputer has been unveiled in upstate New York. It will be used by IBM data and computer scientists, by academic researchers and by industrial and commercial end users.
Installed at the Rensselaer Polytechnic Institute Center for Computational Innovations (CCI), the system, called AiMOS (Artificial Intelligence Multiprocessing Optimized System), was the most powerful to debut on last month’s Top500 supercomputer ranking. It is listed as the world’s 24th most powerful computer, the most powerful to be housed at a private university and, according to the Green500 listing, the third most energy efficient. It was built using the same IBM Power Systems technology as the Top500’s nos. 1 and 2 systems, the US Dept. of Energy’s IBM Summit and Sierra supercomputers, which are based on IBM Power9 CPUs and Nvidia GPUs.
Named for Rensselaer co-founder Amos Eaton, AiMOS is the result of a collaboration between IBM, RPI and two New York state programs, Empire State Development (ESD) and NY CREATES. The machine will serve as a test bed for the IBM Research AI Hardware Center, which opened on the SUNY Polytechnic Institute (SUNY Poly) campus in Albany earlier this year.
The new system is the third in a series of increasingly powerful IBM supercomputers at RPI. The first was a 100-teraflop IBM Blue Gene installed in 2007 and the second an IBM Blue Gene/Q petascale system installed six years ago, according to Christopher D. Carothers, director of the Center for Computational Innovations and a professor in RPI’s Department of Computer Science. He told us the new system is 12x faster than the Blue Gene/Q.
AiMOS comprises 252 compute nodes with a total of 504 IBM Power9 processors and 1,512 Nvidia Volta GPUs. The system has 126 terabytes of system memory, more than 400 terabytes of high-speed, local solid-state storage and a Mellanox fat-tree network offering 6 TB/second of aggregate network bandwidth.
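For readers who want the per-node picture, the totals above can be divided out. The short Python sketch below is only back-of-the-envelope arithmetic on the published figures. It assumes resources are spread evenly across all 252 nodes and that "terabytes" means decimal units, neither of which the announcement states; the variable and function names are illustrative.

```python
# Published system totals for AiMOS (from the announcement).
NODES = 252
CPUS = 504             # IBM Power9 processors
GPUS = 1512            # Nvidia Volta GPUs
MEMORY_TB = 126        # system memory, terabytes
STORAGE_TB_MIN = 400   # "more than 400 terabytes", so a lower bound


def per_node(total, nodes=NODES):
    """Divide a system-wide total evenly across nodes (an assumption)."""
    return total / nodes


cpus_per_node = per_node(CPUS)
gpus_per_node = per_node(GPUS)
gpus_per_cpu = GPUS / CPUS
# Assumes decimal units: 1 TB = 1000 GB.
memory_gb_per_node = per_node(MEMORY_TB) * 1000
storage_tb_per_node_min = per_node(STORAGE_TB_MIN)

print(f"CPUs per node:    {cpus_per_node:g}")
print(f"GPUs per node:    {gpus_per_node:g}")
print(f"GPUs per CPU:     {gpus_per_cpu:g}")
print(f"Memory per node:  {memory_gb_per_node:g} GB")
print(f"Storage per node: at least {storage_tb_per_node_min:.2f} TB")
```

Under those assumptions, each node works out to two Power9 processors, six Volta GPUs (three per processor) and roughly half a terabyte of system memory.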
“The architecture is very well matched for current and future AI and machine learning algorithms as well as advanced scientific computing methods,” said Carothers. “So essentially we can do all of these applications in the same system.”
On the software side, he said AiMOS offers an important advantage resulting from recent IBM M&A activity.
“One of the problems in the past with IBM systems is not all the software would run on them,” he said. “But (AiMOS) is going to run the world’s most ubiquitous open source Linux based operating system, Red Hat, which is now owned by IBM. And so when we bring the data-centric architecture together with Red Hat, it’s going to enable essentially the widest possible range of AI, machine learning and data analytics applications that are currently available. So essentially, very little open source software will not be able to execute, which is sort of a problem, I’d say with past supercomputer systems.”
The result, he said, is that AiMOS will complete neural network training jobs in minutes or hours that formerly required weeks or months. An anticipated impact of this capability, Carothers said, is to “move away from thinking about what we can do at a single focused ‘hero run,’ but instead think about … the whole ensemble of computations that work together in an integrated, cohesive manner. And this is going to enable even a much higher level of solving problems.” This, he said, directly relates to exploration of new AI, machine learning and accelerator hardware design.
“So we want to really think about what are the algorithms doing? And are there pieces to these AI algorithms that we can really think about putting into hardware? Where AiMOS comes into play is we don’t just have to make the hardware, we can begin to simulate it and emulate it on the test bed directly in advance of it actually being fabricated. And oh, by the way, once it’s fab, it could be actually installed at our facility and we can then begin to test it on real research as well as other partner workloads that will be executing within the center.”
IBM said corporate members of its AI Hardware Center include Samsung, Applied Materials and Synopsys, as well as public entities such as RPI, SUNY Poly and other members of the SUNY family.
“Computer artificial intelligence, or more appropriately, human augmented intelligence (AI), will help solve pressing problems, from healthcare to security to climate change,” said Dr. John E. Kelly III, IBM EVP. “In order to realize AI’s full potential, special-purpose computing hardware is emerging as the next big opportunity. IBM is proud to have built the most powerful and smartest computers in the world today… Our collective goal is to make AI systems 1,000 times more efficient within the next decade.”