This year, Penn State launched the NEID (NN-EXPLORE Exoplanet Investigations with Doppler spectroscopy) astronomical spectrograph, part of a collaboration between NASA and the NSF aimed at producing high-precision Doppler observations of exoplanets orbiting nearby stars. The spectrograph, based at Kitt Peak National Observatory in Arizona (pictured in the header), examines the starlight it collects for the minuscule gravitational effects that exoplanets have on their more easily observed host stars. Now, the Texas Advanced Computing Center (TACC) is assisting the NEID effort with its supercomputing power.
“We’re proud that NEID is available to the worldwide astronomical community for exoplanet discovery and characterization,” said Jason Wright, professor of astronomy and astrophysics at Penn State and NEID project scientist, in an interview for TACC and Penn State. “I can’t wait to see the results we and our colleagues around the world will produce over the next few years from discovering new, rocky planets, to measuring the compositions of exoplanetary atmospheres, to measuring the shapes and orientations of planetary orbits, to characterization of the physical processes of these planets’ host stars.”
Supercomputing is necessary for NEID for one core reason: the instrument collects around 150GB of light data each night. That data is sent to Caltech, then forwarded by Caltech to TACC to be processed by TACC’s in-house, automated NEID data pipeline. “The pipeline copies data to us from Caltech via the Globus research data management network,” explained Mike Packar, a senior systems administrator with TACC’s Cloud and Interactive Computing group. “A data analysis then runs on TACC’s Frontera system. It uses the Tapis API to store metadata. Then it sends the data back to Caltech for scientists to analyze.”
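For readers who think in code, the four steps Packard describes can be pictured as a simple chain of stages. The sketch below is illustrative only and is not TACC's pipeline: every function, class, and field name is hypothetical, and each stage is a stand-in that merely records what would happen rather than calling Globus, Frontera's scheduler, or the Tapis API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class NightlyBatch:
    """One night's observations moving through the (hypothetical) pipeline."""
    label: str                      # placeholder identifier for the night
    size_gb: float                  # roughly 150 GB per night, per the article
    location: str = "caltech"       # where the data currently lives
    metadata: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


def transfer_in(batch: NightlyBatch) -> NightlyBatch:
    # Stand-in for the Globus copy from Caltech to TACC.
    batch.location = "tacc"
    batch.history.append("transfer_in")
    return batch


def analyze(batch: NightlyBatch) -> NightlyBatch:
    # Stand-in for the data analysis job that runs on Frontera.
    batch.history.append("analyze")
    return batch


def record_metadata(batch: NightlyBatch) -> NightlyBatch:
    # Stand-in for storing metadata through the Tapis API.
    batch.metadata["label"] = batch.label
    batch.metadata["status"] = "processed"
    batch.history.append("record_metadata")
    return batch


def transfer_out(batch: NightlyBatch) -> NightlyBatch:
    # Stand-in for sending the results back to Caltech.
    batch.location = "caltech"
    batch.history.append("transfer_out")
    return batch


# The stage order mirrors the description quoted above.
STAGES: List[Callable[[NightlyBatch], NightlyBatch]] = [
    transfer_in,
    analyze,
    record_metadata,
    transfer_out,
]


def run_pipeline(batch: NightlyBatch) -> NightlyBatch:
    """Run every stage in order with no human intervention between them."""
    for stage in STAGES:
        batch = stage(batch)
    return batch


if __name__ == "__main__":
    result = run_pipeline(NightlyBatch(label="night-001", size_gb=150.0))
    print(result.location, result.history)
```

Keeping the stages in an ordered list is one plausible way to model a pipeline that runs unattended: each step takes the batch, does its work, and hands it to the next.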
Frontera, the system in question, consists of 8,008 Intel Xeon-powered compute nodes, each with 192GB of memory. It also has two subsystems, each of which contains four Nvidia GPUs per node: one uses Quadro RTX 5000s, the other V100s. All in all, Frontera delivers 23.5 Linpack petaflops, making it one of the most powerful publicly ranked systems in the world since its debut in 2019. The pipeline has also used compute hours on TACC’s Lonestar5 and Stampede2 systems.
The NEID collaboration also has TACC looking closely at future applications of automated data processing pipelines.
“NEID is the first of hopefully many collaborations with the NASA Jet Propulsion Laboratory (JPL) and other institutions where automated data analysis pipelines run with no human in the loop,” said Joe Stubbs, head of the Cloud and Interactive Computing group at TACC and the center’s technical lead for the NEID project. “Tapis Pipelines, a new project that has grown out of this collaboration, generalizes the concepts developed for NEID so that other projects can automate distributed data analysis on TACC’s supercomputers in a secure and reliable way with minimal human supervision.”
To learn more, read the coverage from TACC and Penn State here.