Since 1987 - Covering the Fastest Computers in the World and the People Who Run Them

June 3, 2014

Seismic Imaging at the DEEP End

Tiffany Trader
DEEP prototype hardware

With exascale presenting a much larger challenge than previous computing milestones, an integrated, collaborative approach is all the more necessary. While concerted funding for extreme-scale computing came later than many had hoped, several international efforts are now under way, including the European project DEEP.

DEEP, which stands for Dynamical ExaScale Entry Platform, is one of three exascale-enabling projects funded by the EU's Seventh Framework Programme (FP7). The three-year project began in December 2011 and includes 16 partners from eight countries, with coordination provided by the Jülich Supercomputing Centre at Forschungszentrum Jülich. DEEP spans a wide array of applications, among them brain research, climatology and seismology.

Application developer Marc Tchiboukdjian recently spoke about his team's experience with the DEEP system. An IT architect at France-based geophysical services company CGG, Tchiboukdjian says his team is responsible for selecting the most appropriate hardware to run the company's seismic imaging applications. The team then collaborates with other groups inside CGG to optimize these applications for the selected hardware. For the DEEP project, the team is currently analyzing how the TTI-RTM (tilted transverse isotropic reverse time migration) module, which is used for understanding seismic wave propagation in complex geology, performs on the DEEP architecture.

Asked about the need for exascale computing for seismic imaging workloads, Tchiboukdjian responds that seismic data processing and imaging algorithms are indeed pushing the boundaries of computing, and could do so much more if given access to exascale systems. He adds that future processing requirements will have two main drivers: “Improved data acquisition techniques in the field will result in much bigger datasets being generated with millions of traces holding more information, such as increased frequency content. Moreover, seismic algorithms will also improve and include more accurate physics for wave propagation in the modelling engines and full-waveform inversion to improve velocity models.”

Heterogeneous computing, via accelerators like GPUs and coprocessors like the Intel Xeon Phi, is being looked at as an "extreme-scale" computing enabler. Current production systems at CGG are based on GPU clusters, and Tchiboukdjian relates that GPU-accelerated nodes are well suited for RTM (reverse time migration) because of the high bandwidth of the GDDR memory. By extension, Tchiboukdjian is confident that the Intel Xeon Phi could be a similarly good candidate because it also has high memory bandwidth.
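To illustrate why memory bandwidth matters so much for this class of workload, here is a minimal sketch of one time step of a wave-propagation stencil. It is not CGG's TTI-RTM kernel: it assumes a 1D isotropic acoustic model, and the function names, the flop count and the bytes-per-point figure (single precision, neighbouring values reused from cache) are assumptions made for illustration only.

```python
def wave_step(prev, curr, coeff):
    """Advance a 1D acoustic wavefield by one time step.

    prev, curr: wavefield at the two most recent time steps.
    coeff: per-point value of (velocity * dt / dx) ** 2 (hypothetical input).
    Boundary points are held at zero.
    """
    n = len(curr)
    nxt = [0.0] * n
    for i in range(1, n - 1):
        # Second-order finite-difference Laplacian: 3 floating-point operations.
        laplacian = curr[i - 1] - 2.0 * curr[i] + curr[i + 1]
        # Leapfrog time update: 4 more floating-point operations.
        nxt[i] = 2.0 * curr[i] - prev[i] + coeff[i] * laplacian
    return nxt


def arithmetic_intensity(flops_per_point, bytes_per_point):
    """Floating-point operations performed per byte moved to or from memory."""
    return flops_per_point / bytes_per_point


# Assumed traffic per grid point: read curr, prev and coeff, write nxt,
# at 4 bytes each (single precision, ideal cache reuse of neighbours).
FLOPS_PER_POINT = 7
BYTES_PER_POINT = 4 * 4

if __name__ == "__main__":
    field = wave_step([0.0, 0.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 0.0, 0.0],
                      [0.25] * 5)
    print(field)
    print(arithmetic_intensity(FLOPS_PER_POINT, BYTES_PER_POINT))
```

Under these assumptions the kernel performs well under one floating-point operation per byte of memory traffic, so its speed is set mainly by how fast data can be streamed from memory rather than by peak arithmetic throughput. Production 3D kernels with wider stencils and anisotropy terms have different ratios, but the same reasoning applies.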

The DEEP project is exploring the potential of the Phi chips for its "cluster booster architecture." The final DEEP prototype system will consist of a 128-node Eurotech Aurora Cluster and a 512-node Booster built from Intel Xeon Phi coprocessors (the Booster Nodes), connected by the EXTOLL high-performance 3D torus network.

Tchiboukdjian speaks to some of the merits of the approach. “The DEEP cluster offers a high density thanks to an innovative and efficient cooling strategy,” he shares. “It is also more flexible compared to classical GPU-accelerated nodes. Thanks to the fast network linking the cluster nodes and the booster nodes together, you can dynamically adjust the ratio of host processing and co-processing according to the needs of each application.”

Tchiboukdjian also comments on the new software stack that has been employed for the DEEP project. It uses an innovative approach (based on a modified version of OmpSs) that allows the user to select and execute the most scalable and compute-intensive parts of the code on the Xeon Phi.

Looking ahead, the next iteration of DEEP will include upgrades that should work even better for CGG's application needs, according to Tchiboukdjian. The upcoming second-generation Intel Xeon Phi will have more memory and the booster nodes will use non-volatile memory, so there will be sufficient local scratch bandwidth for RTM to run smoothly, he adds.
