OpenACC Expands Community, Reveals Roadmap Details

By Tiffany Trader

November 7, 2016

In advance of the SC16 expo in Salt Lake City next week, the OpenACC standards group today welcomed its newest member, NSCC-Wuxi, and highlighted a number of important developments for the directives-based programming standard. Ahead of the announcement, HPCwire spoke with Michael Wolfe, technical director of OpenACC, and Duncan Poole, OpenACC president and director of platform alliances for accelerated computing at Nvidia.

OpenACC (Open Accelerators) was developed by Cray, CAPS, Nvidia and PGI circa 2011. The standard was designed to simplify parallel programming of heterogeneous CPU-GPU machines and has since added support for additional multicore/manycore platforms, while maintaining code portability.

OpenACC’s newest member is the National Supercomputing Center (NSCC) in Wuxi, China, home to the Sunway TaihuLight system, which made its grand TOP500 entrance at ISC 2016, pushing the LINPACK record to 93 petaflops. NSCC-Wuxi runs a custom version of OpenACC developed for the Sunway system’s 260-core Chinese-made processor.

“The OpenACC paradigm was chosen for its better fit to our many-core processor, with a few extensions to better support the efficient utilization of the new hardware features such as the Scratch Pad Memory for each core and DMA instructions,” said Dr. Haohuan Fu, deputy director of the National Supercomputing Center in Wuxi and associate professor at the Center for Earth System Science at Tsinghua University, in a prepared statement.

TaihuLight’s Sunway manycore processors are composed of four core groups; each core group has one management processing element (MPE) and 64 compute processing elements (CPEs) for a total of 260 cores per CPU. Says Wolfe, who is also a compiler engineer with PGI (Nvidia), “Essentially, there’s a control processor that runs the main application and offloads the parallel region to the compute elements. When they’re using OpenMP the offload model offloads the parallel part to the compute elements and the master thread goes on with the scalar part of the code. NSCC-Wuxi wanted that master thread to participate in the parallel work and thought that would be more natural with the OpenACC model.” Wolfe added that OpenMP also has a lot of synchronization constructs that are challenging to implement on the manycore architecture.
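The idea of a host thread that shares the parallel work, rather than only running the scalar code, can be illustrated in standard OpenACC. The sketch below is not NSCC-Wuxi's customized implementation, which the article does not describe; it is a minimal illustration in C that assumes a hypothetical function `scale_shared` and a caller-chosen `split` point. Part of a loop is launched asynchronously on the accelerator while the host thread processes the remainder. Built without OpenACC support, the directives are ignored and the whole loop runs on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: scale x[0:n] in place by a.
   The first `split` elements go to the accelerator asynchronously;
   the host thread handles the rest instead of idling. */
void scale_shared(float *x, size_t n, size_t split, float a)
{
    assert(split <= n);

    /* Offloaded share: enqueued on async queue 1, so control
       returns to the host thread right away. */
    #pragma acc parallel loop async(1) copy(x[0:split])
    for (size_t i = 0; i < split; ++i)
        x[i] *= a;

    /* Host share: runs while the accelerator works on its part. */
    for (size_t i = split; i < n; ++i)
        x[i] *= a;

    /* Join: block until the offloaded share has finished and
       its results have been copied back. */
    #pragma acc wait(1)
}
```

The two shares touch disjoint parts of the array, so no further synchronization is needed beyond the final wait. Choosing a good `split` is a load-balancing question that this sketch leaves to the caller.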

OpenACC was used to parallelize and tune one of three NSCC-Wuxi codes on the short-list to receive the prestigious Gordon Bell prize at SC16. CAM-SE is a “10 million core scalable fully-implicit solver for nonhydrostatic atmospheric dynamics” that contains 530,000 lines of code.

A number of flagship HPC codes are also using OpenACC, notably Gaussian, widely used in quantum chemistry, and ANSYS Fluent, the popular commercial CFD software. “We build and support Fluent on a wide variety of parallel computing systems, and we need to be able to write a single version of our source code that runs efficiently on all of those systems,” said Sunil Sathe, Fluent lead software developer. “With OpenACC, we were able to quickly enable a key solver for GPU acceleration while keeping the same code base for CPU execution. The OpenACC performance was excellent on NVIDIA GPUs and very competitive on CPUs.”

OpenACC is also being used by five of the thirteen application-readiness codes used to qualify the 200-petaflops Summit supercomputer that is going in at Oak Ridge Labs. “This shouldn’t be a surprise because the Oak Ridge Leadership Computing Facility is a big OpenACC user today,” said Duncan Poole, president of OpenACC and Nvidia executive.

OpenACC also has production support for OpenPOWER, covering both multicore OpenPOWER and CPU + GPU implementations. This support will be key for the Summit supercomputer, which will have some 3,400 nodes comprising multiple Power9 CPUs and multiple Nvidia Volta GPUs, connected with Nvidia’s second-generation NVLink technology. Poole said that support for manycore Xeon Phi CPUs (i.e., Knights Landing and follow-ons) is on track for 2017. For more information about how OpenACC is supporting the OpenPOWER architecture, see our June coverage.

The OpenACC roadmap

OpenACC also previewed features that will be added to its next release (2.6), targeted for the middle of next year. One of the key features that users have requested for the last couple of years is “deep copy.” When we talked with Wolfe last year at this time, he said deep copy was being targeted for the 3.0 release, but now the standards body is planning an interim step that enables OpenACC to support a manual deep copy.

Wolfe explains, “This is where you have deeply nested data structures with pointers to other data structures that have pointers to other data structures and want to move the whole structure over to the device, which is a different memory space with different addresses and still keep the pointers valid.

“We’ve been struggling with a way to define this in a manner that is seamless to use and still performant. We arrived at the decision to make a small change to the specification so that users can do a manual deep copy.”

Manual deep copy gives users the behavior that they want, although it’s not as convenient as they would like, Wolfe commented. The standards group is looking for someone to do an implementation of the true deep copy before it is hardened into the specification. Wolfe wouldn’t speculate on a timeline: “If we can get a prototype implementation, our hope is that that may shake out potential problems, but we cannot predict how many of those there might be.”
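To make the manual approach concrete, here is a minimal sketch in C of what a hand-written deep copy might look like for a struct with one pointer member. The article does not show the syntax, so the directive forms below are an assumption based on how OpenACC data clauses are commonly written; `vec_t` and `vec_sum` are hypothetical names. The user copies the parent struct first and the pointed-to array second, and relies on the runtime to make the device-side pointer refer to the device-side array. Built without OpenACC support, the directives are ignored and the code runs on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical nested structure: a length plus a pointer to data. */
typedef struct {
    int     n;
    double *data;   /* a plain struct copy would carry a host address */
} vec_t;

/* Sum the elements of v on the accelerator after a manual deep copy. */
double vec_sum(vec_t v)
{
    assert(v.n >= 0 && v.data != NULL);
    double sum = 0.0;

    /* Step 1: copy the struct itself. This is shallow: the device copy
       of v.data still holds a host address at this point. */
    #pragma acc enter data copyin(v)

    /* Step 2: copy the array it points to. Assumption: because the
       parent struct is already present, the runtime updates the device
       copy of v.data to point at the device array. */
    #pragma acc enter data copyin(v.data[0:v.n])

    #pragma acc parallel loop present(v) reduction(+:sum)
    for (int i = 0; i < v.n; ++i)
        sum += v.data[i];

    /* Tear down in reverse order: child array first, then the parent. */
    #pragma acc exit data delete(v.data[0:v.n])
    #pragma acc exit data delete(v)

    return sum;
}
```

Every additional level of nesting needs its own pair of copy and delete steps in the right order, which illustrates why users find the manual form less convenient than a true deep copy.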

Additional features planned for OpenACC include Device Query Routines, Error Callback Routine, Polymorphic Routine Compilation, Serial Compute Construct, and Array Reductions.

These are all highly requested by users, the actual people working on programs, said Wolfe.

OpenACC doesn’t have a calendar-based release cadence. Instead, the group collects requests and issues a new release once enough of them have accumulated to justify one.

“What we want to work on is the big items, things like true deep copy or a seamless way to spread parallel regions across multiple devices, or load balancing across the GPU and the CPU and how do you manage that. Those are big items; those are what users really want,” said Wolfe.

He added, “Last summer I was visiting CSCS in Lugano, Switzerland, and each node of the cluster they host for the weather forecasting service MeteoSwiss has four K80s, so eight GPUs per node. Well how do you manage that? Is it easy or is there a way to make it even easier, that’s a big challenge, and we’re ready to take it on.”
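For a sense of what managing several devices by hand involves today, the following C sketch splits one loop across all visible accelerators. It is an illustration under stated assumptions, not a method described by Wolfe: `device_count`, `chunk_bounds` and `scale_multi` are hypothetical helpers, and the runtime calls `acc_get_device_type`, `acc_get_num_devices` and `acc_set_device_num` are used only when the code is built with OpenACC enabled. Otherwise the sketch falls back to a single host "device".

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Number of accelerators visible to this process
   (1 when built without OpenACC). */
int device_count(void)
{
#ifdef _OPENACC
    int n = acc_get_num_devices(acc_get_device_type());
    return n > 0 ? n : 1;
#else
    return 1;
#endif
}

/* Half-open range [lo, hi) of the d-th of ndev near-equal chunks of n items. */
void chunk_bounds(size_t n, int ndev, int d, size_t *lo, size_t *hi)
{
    assert(ndev > 0 && d >= 0 && d < ndev);
    size_t base  = n / (size_t)ndev;
    size_t extra = n % (size_t)ndev;   /* first `extra` chunks get one more item */
    size_t k     = (size_t)d;
    *lo = k * base + (k < extra ? k : extra);
    *hi = *lo + base + (k < extra ? 1 : 0);
}

/* Scale x[0:n] by a, giving each device one contiguous chunk. */
void scale_multi(float *x, size_t n, float a, int ndev)
{
    /* Launch phase: select each device in turn and enqueue its chunk. */
    for (int d = 0; d < ndev; ++d) {
        size_t lo, hi;
        chunk_bounds(n, ndev, d, &lo, &hi);
#ifdef _OPENACC
        acc_set_device_num(d, acc_get_device_type());
#endif
        #pragma acc parallel loop async(1) copy(x[lo:hi-lo])
        for (size_t i = lo; i < hi; ++i)
            x[i] *= a;
    }

    /* Join phase: async queues are per device, so wait on each one. */
    for (int d = 0; d < ndev; ++d) {
#ifdef _OPENACC
        acc_set_device_num(d, acc_get_device_type());
#endif
        #pragma acc wait
    }
}
```

A caller would typically pass `device_count()` as `ndev`. The explicit device selection, chunk arithmetic and per-device waits are the kind of bookkeeping that a seamless multi-device feature would aim to hide.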

Community engagement and education

Via partnerships with Oak Ridge National Laboratory and other member orgs, OpenACC continues to offer hackathons around the world. More info is at http://www.openacc.org/hackathons. Oak Ridge and OpenACC will also be conducting a series of free two-and-a-half-day workshops starting next year. These are designed to introduce developers to the framework and are a new addition to OpenACC’s training and education program.

SC16 activities include:

OpenACC Birds of a Feather, Wed. Nov. 16th, 5:30–7:00 pm in room 155-C. Discussion will include such topics as “Should OpenACC and OpenMP ever merge?”

Free “Parallel Programming with OpenACC” books will be signed by author Rob Farber Monday, November 14 from 7:00 to 9:00 pm in the OpenACC booth #634.

Bringing About HPC Open-Standards World Peace, Nov. 16th, 10:30 am in room 255-BC.

Members will be available for questions in the OpenACC booth #634.

Visit http://www.openacc.org/sc16 for a list of all OpenACC member activities.
