Royalty-free stock illustration ID: 1675260034

Solving Heterogeneous Programming Challenges with SYCL

December 8, 2021

In the first of a series of guest posts on heterogenous computing, James Reinders, who returned to Intel last year after a short “retirement,” considers how SYCL will contribute to a heterogeneous future for C++. Reinders digs into SYCL from multiple angles... Read more…

15 Slides on Programming Aurora and Exascale Systems

May 7, 2020

Sometime in 2021, Aurora, the first planned U.S. exascale system, is scheduled to be fired up at Argonne National Laboratory. Cray (now HPE) and Intel are the k Read more…

DARPA Looks to Propel Parallelism

September 4, 2019

As Moore’s law runs out of steam, new programming approaches are being pursued with the goal of greater hardware performance with less coding. The Defense Advanced Projects Research Agency is launching a new programming effort aimed at leveraging the benefits of massive distributed parallelism with less sweat. Read more…

An Overview of ‘OpenACC for Programmers’ from the Book’s Editors

June 20, 2018

In an era of multicore processors coupled with manycore accelerators in all kinds of devices from smartphones all the way to supercomputers, it is important to Read more…

GTC18 Research Highlight: Programming a Hybrid CPU-GPU Cluster Using Unicorn

March 27, 2018

Unicorn is a parallel programming framework that provides a simple way to program multi-node clusters with CPUs and GPUs, and potentially other compute devices. Read more…

Optimizing Codes for Heterogeneous HPC Clusters Using OpenACC

July 3, 2017

Looking at the Top500 and Green500 ranks, one clearly realizes that most HPC systems are heterogeneous architecture using COTS (Commercial Off-The-Shelf) hardware, combining traditional multi-core CPUs with massively parallel accelerators, such as GPUs and MICs. With processor frequencies now hitting a solid wall, the only truly open avenue for riding today the Moore’s law is increasing hardware parallelism in several different ways: more computing nodes, more processors in each node, more cores within each processor, and longer vector instructions in each core. Read more…

LOLCODE: I Can Has Supercomputer?

April 5, 2017

What programming model refers to threads as friends and uses types like NUMBR (integer), NUMBAR (floating point), YARN (string), and TROOF (Boolean)? That would Read more…

MIT’s Multicore Swarm Architecture Advances Ordered Parallelism

July 21, 2016

A relatively new architecture explicitly designed for parallelism – Swarm – based on work at MIT has shown promise for substantially speeding up classes of Read more…

  • arrow
  • Click Here for More Headlines
  • arrow


Porting CUDA Applications to Run on AMD GPUs

Giving developers the ability to write code once and use it on different platforms is important. Organizations are increasingly moving to open source and open standard solutions which can aid in code portability. AMD developed a porting solution that allows developers to port proprietary NVIDIA® CUDA® code to run on AMD graphic processing units (GPUs).

This paper describes the AMD ROCm™ open software platform which provides porting tools to convert NVIDIA CUDA code to AMD native open-source Heterogeneous Computing Interface for Portability (HIP) that can run on AMD Instinct™ accelerator hardware. The AMD solution addresses performance and portability needs of artificial intelligence (AI), machine learning (ML) and high performance computing (HPC) for application developers. Using the AMD ROCm platform, developers can port their GPU applications to run on AMD Instinct accelerators with very minimal changes to be able to run their code in both NVIDIA and AMD environments.

Download Now

Sponsored by AMD


QCT HPC BeeGFS Storage: A Performance Environment for I/O Intensive Workloads

A workload-driven system capable of running HPC/AI workloads is more important than ever. Organizations face many challenges when building a system capable of running HPC and AI workloads. There are also many complexities in system design and integration. Building a workload driven solution requires expertise and domain knowledge that organizational staff may not possess.

This paper describes how Quanta Cloud Technology (QCT), a long-time Intel® partner, developed the Taiwania 2 and Taiwania 3 supercomputers to meet the research needs of the Taiwan’s academic, industrial, and enterprise users. The Taiwan National Center for High-Performance Computing (NCHC) selected QCT for their expertise in building HPC/AI supercomputers and providing worldwide end-to-end support for solutions from system design, through integration, benchmarking and installation for end users and system integrators to ensure customer success.

Download Now

Sponsored by QCT

Advanced Scale Career Development & Workforce Enhancement Center

Featured Advanced Scale Jobs:

Receive the Monthly
Advanced Computing Job Bank Resource:

HPCwire Resource Library

HPCwire Product Showcase

Subscribe to the Monthly
Technology Product Showcase: