One major barrier to unlocking future system performance is overcoming the bandwidth bottlenecks that limit inter-processor and memory performance. Today’s high-performance computing (HPC) systems must process massive amounts of data, and data volumes will only grow in the exascale computing era. There have been major advances in the processors and accelerators (XPUs) used to accelerate and optimize data processing for scientific and AI workloads. Large HPC systems, however, currently have an architecture in which compute and memory resources are tightly coupled.
There is a growing disparity of speed between the XPU and memory outside the XPU package, often caused by the limited communication bandwidth of interconnect technologies. Current memory solutions, such as HBM and DDR5, are constrained because of thermal and signal integrity issues. Lawrence Livermore National Laboratory (LLNL) recently performed a study on four large HPC clusters to test memory utilization and concluded that, “Our results show that more than 90% of jobs utilize less than 15% of the node memory capacity, and for 90% of the time, memory utilization is less than 35%.”
Breaking the Memory Bottleneck Through Disaggregation
Future HPC and AI system architectures may look to disaggregate, or decouple, resources as the answer to some of these memory challenges. One approach is to separate XPUs and memory into distinct physical entities, creating shared pools of resources such as DRAM that any XPU can access. For example, the system could connect cores to memory dynamically as requests arrive. Applications could then draw on all memory available across an entire data center instead of being confined to the memory of a single server.
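To make the pooling idea concrete, here is a minimal sketch in Python of how a disaggregated memory pool might behave: several nodes contribute capacity to one shared pool, and an allocation request is satisfied from whichever nodes have free capacity, rather than being limited to one server's local DRAM. The class and method names are hypothetical illustrations, not any real fabric-attached memory API.

```python
class MemoryPool:
    """Illustrative sketch: memory capacity pooled across nodes."""

    def __init__(self):
        self.free_gb = {}  # node id -> free capacity in GB

    def contribute(self, node: str, capacity_gb: int) -> None:
        """A node donates its memory to the shared pool."""
        self.free_gb[node] = self.free_gb.get(node, 0) + capacity_gb

    def total_free_gb(self) -> int:
        return sum(self.free_gb.values())

    def allocate(self, request_gb: int) -> dict:
        """Grant a request from whichever nodes have free capacity,
        mimicking 'connect cores to memory as requests arrive'."""
        grant, remaining = {}, request_gb
        # Take from the nodes with the most free capacity first.
        for node, free in sorted(self.free_gb.items(), key=lambda kv: -kv[1]):
            if remaining == 0:
                break
            take = min(free, remaining)
            if take:
                grant[node] = take
                self.free_gb[node] -= take
                remaining -= take
        if remaining:  # pool exhausted: roll back the partial grant
            for node, take in grant.items():
                self.free_gb[node] += take
            raise MemoryError(f"pool cannot satisfy {request_gb} GB")
        return grant


pool = MemoryPool()
pool.contribute("node-a", 256)
pool.contribute("node-b", 256)
# A 300 GB request spans two nodes -- impossible with per-server memory.
grant = pool.allocate(300)
```

Note the key property the sketch demonstrates: a single allocation can exceed any one node's capacity, which is exactly what tightly coupled compute-memory architectures cannot offer.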
Solving the Bandwidth Bottlenecks Using Optical I/O
Disaggregating and pooling system components raises questions about how to manage workflows around such systems, as well as the fundamental problem of moving data between these components. Enabling these new flexible system architectures will require high-bandwidth, low-latency interconnects.
A transition to photonics (or optical I/O) enables memory to be pooled with low latency and high performance. A variety of protocol efforts are exploring optical I/O as a path to system scalability. One example is Compute Express Link (CXL), an emerging unified protocol for disaggregated systems that uses PCIe electrical signaling for its I/O interconnect. “Optical I/O is expected to be the foundation of new interconnects that will allow heterogeneous connectivity, with tremendous bandwidth, low latency and low power, across a range of new system designs,” states Vladimir Stojanovic, Chief Architect & Co-Founder, Ayar Labs.
Join us for the Advanced Memory Architectures to Overcome Bandwidth Bottlenecks for the Exascale Era of Computing webinar on November 10 at 9:00 am PT. During this webinar, leading industry experts will discuss their insights on what the future has in store for advanced memory architectures, exciting new optical I/O solutions using silicon photonics, and the technologies and environments needed to make next generation performance a reality.
Industry analyst Addison Snell of Intersect360 will lead the discussion with experts from industry, as well as US supercomputing national laboratories, in this webinar panel:
- Mohamad El-Batal, CTO Cloud Systems, Seagate
- William Magro, Chief Technologist, High-Performance Computing, Google
- Ivy Peng, Computer Scientist, Lawrence Livermore National Laboratory
- Vladimir Stojanovic, Chief Architect & Co-Founder, Ayar Labs
- Marten Terpstra, Sr Director, PLM & Business Development, High Performance Networking and Silicon Photonics, HPE
For more information on next-generation solutions to memory bottlenecks, join this upcoming webinar.