EU Projects Unite on Heterogeneous ARM-based Exascale Prototype

By Tiffany Trader

February 24, 2016

A trio of partner projects based in Europe – Exanest, Exanode and Ecoscale – are working in close collaboration to develop the building blocks for an exascale architecture prototype that will, as they describe, put the power of ten million computers into a single supercomputer. The effort is unique in seeking to advance the ARM64 + FPGA architecture as a foundational “general-purpose” exascale platform.

The projects are funded for three years as part of Europe’s Horizon2020 program, and the partners are coordinating their efforts with the goal of building an early “straw man” prototype late this year that will consist of more than one thousand energy-efficient ARM cores, reconfigurable logic, plus advanced storage, memory, cooling and packaging technologies.

Exanest is the project partner that is focused on the system level, including interconnection, storage, packaging and cooling. And as the name implies, Exanode is responsible for the compute node and the memory of that compute node. Ecoscale focuses on employing and managing reconfigurable logic as accelerators within the system.

Exanest

Manolis Katevenis, the project coordinator for Exanest and head of computer architecture at FORTH-ICS in Greece, explains that Exanest has set an early target of 2016 to build this “relatively-large” first prototype, comprising at least one thousand ARM cores.

He says, “We are starting early with a prototype based on existing technology because we want system software to be developed and applications to start being ported and tuned. For the remainder of the two years, there will be ongoing software development, plus research on interconnects, storage and cooling technologies. We also believe that there will be new interesting compute nodes coming out from our partner projects and we will use such nodes.”

In discussing target workloads, Katevenis emphasizes flexibility and breadth, echoing the sentiments we are hearing from across the HPC community. The goal for this platform is to be able to support a range of applications, both on the traditional compute and physics side and the data-intensive side. A look at the Exanest partner list hints at the kind of high-performance applications that will be supported: astrophysics, nuclear physics, simulation-based engineering, and even in-memory databases with partner MonetDB Solutions. Allinea will be providing the ARMv8 profiling and debugging tools.

Although the projects are still in the specification phase, they will be making selections with the aim of overcoming the specific challenges related to exascale. Areas of focus include compact packaging, permanent storage, interconnection, resilience and application behavior. Some of the design decisions were revealed in this poster from Exanest that shows a diagram of the daughterboard and blade design. Note that Xilinx is a key partner.

Exanest daughterboard and blade design

To achieve a complete prototype capable of running real-world benchmarks and applications by 2018, the primary partners are collaborating with a number of other academic groups and industry partners using co-design principles to develop the hardware and software elements. This is a classic public-private arrangement where academic and industrial partners join forces and industrial partners benefit by being able to reuse the technology that is developed.

On the technology side, packaging and cooling are a key focus for Exanest, which will rely on Iceotope, the immersion cooling vendor, to design an innovative cooling environment. The first prototype will employ Iceotope technology, and the expectation is that technology with even higher power density will be developed as the project progresses.

One of the primary criteria for the project partners is low energy consumption for the main processor. They have chosen 64-bit ARM processors as their main compute engine. Katevenis affirms that having a processor that consumes dramatically less power allows many more cores to be packaged in the same physical volume and within the same total power consumption budget. “One way we will achieve scale is this low-power consumption,” says the project lead, “but another is by having accelerators to provide floating point performance boost to appropriate applications.”

As for topology, the Exanest team is discussing the family of networks that includes fat-tree and Dragonfly topologies. They will link blades through optical fibers that they can plug and unplug, allowing them to experiment with more than one topology. Exanest will also use FPGAs to build the interconnection network so the team can experiment with novel protocols.
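To give a sense of how the two topology families mentioned above scale, the following is a minimal sketch using the standard textbook sizing formulas for a three-level k-ary fat tree and a maximum-size Dragonfly. The function names and the example parameters are illustrative assumptions; the article does not state which switch radix or group sizes Exanest will use.

```python
def fat_tree_size(k):
    """Size of a three-level fat tree built from k-port switches.

    Textbook k-ary fat tree: k pods, each with k/2 edge and k/2
    aggregation switches, plus (k/2)^2 core switches.
    """
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even number of ports, at least 2")
    return {
        "hosts": k ** 3 // 4,          # k pods * (k/2 edge switches) * (k/2 hosts each)
        "switches": 5 * k ** 2 // 4,   # k^2 pod switches + k^2/4 core switches
    }


def dragonfly_size(p, a, h):
    """Size of a maximum-size Dragonfly network.

    p: hosts attached to each router
    a: routers per group (fully connected inside the group)
    h: global links per router
    """
    groups = a * h + 1                 # every group links once to every other group
    return {
        "groups": groups,
        "routers": a * groups,
        "hosts": p * a * groups,
        "radix": p + (a - 1) + h,      # ports needed on each router
    }


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the formulas at work.
    print(fat_tree_size(4))
    print(dragonfly_size(p=2, a=4, h=2))
```

Because the blades are linked by pluggable optical fibers, both wirings could in principle be tried on the same hardware; the sketch only compares how many endpoints each family reaches for a given router size.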

Exanode

Denis Dutoit, the project coordinator for Exanode, tells HPCwire the goal of that project is to build a node-level prototype with technologies that exhibit exascale potential. The three building blocks are heterogeneous compute elements (ARMv8 low-power processors plus various accelerators, namely FPGAs, although ASICs and GPGPUs may also be explored); 3D interposer integration for compute density; and, continuing the efforts of the EUROSERVER project, an advanced memory scheme for low-latency, high-bandwidth memory access, scalable to exabyte levels.


Dutoit, who is the strategic marketing manager, architecture, IC design and embedded software division at CEA-Leti, notes that this is a technology-driven project at the start, but that a complete software stack for HPC capability will sit on top of the prototype. Evaluation will be done first at the node level, explains Dutoit. The team will begin with emulated hardware and representative HPC applications; after that, Exanest will reuse these compute nodes and integrate them into its complete machine for full testing and evaluation with real applications.

There will be a formal effort to productize the resulting technology through a partnership with Kaleao, a UK company that focuses on energy-efficient, compact hyperconverged platforms.

Ecoscale

Iakovos Mavroidis, project coordinator for Ecoscale, says that while there are three main projects, he sees it as one big project with Ecoscale dedicated to reconfigurable computing.

A member of the Computer Architecture and VLSI Systems (CARV) Laboratory of FORTH-ICS and of the Telecommunication Systems Institute, Mavroidis notes that the main problem being addressed is how to improve today’s HPC servers. Simple scaling without improving technologies is infeasible due to utility costs and power consumption limitations. Ecoscale is tackling these challenges by proposing a scale-out hybrid MPI+OpenCL programming environment and a runtime system, along with a hardware architecture tailored to the needs of HPC applications. The programming model and runtime system follow a hierarchical approach in which the system is partitioned into multiple autonomous workers (i.e., compute nodes).

Ecoscale framework

“The main focus of Ecoscale is to support shared partitioned reconfigurable resources, accessed by these compute nodes,” says Mavroidis. “The intention is to have a global notion of the reconfigurable resources so that each compute node can access remote reconfigurable resources not only its own local resources. The logic can also be shared by several compute nodes working in parallel.” To accomplish this, workers are interconnected in a tree-like structure in order to form larger Partitioned Global Address Space (PGAS) partitions, which are further hierarchically interconnected via an MPI protocol.
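The idea of a global view of reconfigurable resources can be sketched in a few lines of Python. This is an illustrative model only, not Ecoscale's runtime: the class names, the notion of a countable "slot" of reconfigurable logic, and the search order (local first, then the same PGAS partition, then other partitions reached over MPI) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """A compute node with some free reconfigurable slots (hypothetical unit)."""
    name: str
    free_slots: int


@dataclass
class Partition:
    """A group of workers that share one PGAS partition."""
    name: str
    workers: list = field(default_factory=list)


def acquire_slot(requester, partition, all_partitions):
    """Grant one reconfigurable slot to `requester`.

    Returns (kind, owner_name), or None if no slot is free anywhere.
    The search order is an assumption for illustration.
    """
    # 1. Prefer the requester's own local logic.
    if requester.free_slots > 0:
        requester.free_slots -= 1
        return ("local", requester.name)
    # 2. Then a neighbour inside the same shared-address-space partition.
    for worker in partition.workers:
        if worker is not requester and worker.free_slots > 0:
            worker.free_slots -= 1
            return ("same-partition", worker.name)
    # 3. Finally a worker in another partition (reached via MPI in the article's description).
    for other in all_partitions:
        if other is partition:
            continue
        for worker in other.workers:
            if worker.free_slots > 0:
                worker.free_slots -= 1
                return ("cross-partition", worker.name)
    return None


if __name__ == "__main__":
    a, b, c = Worker("a", 1), Worker("b", 1), Worker("c", 1)
    p0, p1 = Partition("p0", [a, b]), Partition("p1", [c])
    for _ in range(4):
        print(acquire_slot(a, p0, [p0, p1]))
```

In this toy model, a node that has exhausted its own logic borrows from its neighbours before reaching across partitions, which mirrors the hierarchy Mavroidis describes without claiming anything about the real scheduling policy.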

“The virtualization will happen automatically in hardware and it has to be done because reconfigurable resources are very limited unless remote access is enabled,” states Mavroidis. “The aim is to provide a user-friendly way for the programmer to use all the reconfigurable logic in the system. This requires a very high-speed low-latency interconnection topology and this is what Exanest will provide.”

Mavroidis explains that there must be a means for the programmer to access the system, and that at a higher level the runtime system has to be redefined to understand the needs of the application so it can reconfigure the machine. He believes that fully implementing this will require innovation in all layers of the stack, and that the programming model itself will need to be redefined. The partners are aiming to support most of the existing and common HPC libraries in order to make this architecture available to most existing applications.

The main focus of Ecoscale is to automate out the complexity of FPGA programming. Anyone who has watched FPGAs struggle to get a foothold in HPC knows this is not an easy task, but the need for low-power performance is driving interest and innovation. “The programmer should not have to be aware that the machine uses reconfigurable computing, but rather be able to write the program using high-level programming model such as MPI or Standard C,” states Mavroidis.

On a related note, Exanest project partner BeeGFS has just announced that the BeeGFS parallel file system is now available as open source from www.beegfs.com. “Although BeeGFS can already run out of the box on ARM systems today, this project [Exanest] will give us the opportunity to make sure that we can deliver the maximum performance on this architecture as well,” shares Bernd Lietzow, BeeGFS head for Exanest.
