Gravitational wave astronomy burst onto the scene with the success of the original LIGO (Laser Interferometer Gravitational-Wave Observatory) effort and has since continued with the expanded Advanced LIGO (aLIGO) project, which has now identified five binary black hole mergers producing gravitational waves (GW). New deep learning tools developed at the University of Illinois Urbana-Champaign and the National Center for Supercomputing Applications (NCSA) now promise to accelerate aLIGO discovery efforts.
Writing in Physical Review D last month (Deep neural networks to enable real-time multimessenger astrophysics), researchers from the University of Illinois and NCSA introduce Deep Filtering, a new scalable machine learning method for end-to-end time-series signal processing. Authors Daniel George and E. A. Huerta say Deep Filtering outperforms conventional machine learning techniques and achieves performance comparable to matched filtering while being several orders of magnitude faster, allowing real-time signal processing with minimal resources.
“An important advantage of Deep Filtering is its scalability, i.e., all the intensive computation is diverted to the one-time training stage, after which the data sets can be discarded, i.e., the size of the template banks presents no limitation when using deep learning. With existing computational resources on supercomputers, such as Blue Waters, it will be feasible to train DNNs that target a nine-dimensional parameter space within a few weeks. Furthermore, once trained these DNNs can be evaluated in real time with a single CPU, and more intensive searches over longer time periods covering a broader range of signals can be carried out with a dedicated GPU,” write George and Huerta.
Given the growing deluge of data expected from aLIGO, the new approach could pave the way for wider use of deep neural networks in multimessenger astrophysics. “Accelerating the offline Bayesian parameter estimation algorithms, which typically last from several hours to a few days, is no trivial task since they have to sample a 15-dimensional parameter space,” note the authors.
Although George and Huerta’s paper focuses on applying Deep Filtering to aLIGO datasets, it also contains an excellent and accessible summary of machine learning and deep learning techniques and their contrasting characteristics.
Deep Filtering is built on two deep convolutional neural networks, one designed for classification and one for regression, which detect gravitational wave signals in highly noisy time-series data streams and estimate the parameters of their sources in real time. “The results indicate that Deep Filtering outperforms conventional machine learning techniques, achieves similar performance compared to matched filtering, while being several orders of magnitude faster, allowing real-time signal processing with minimal resources,” write the researchers.
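The paper spells out the exact network configurations; purely as an illustration of the general idea, a minimal 1D convolutional classifier for short strain segments might look like the PyTorch sketch below. The layer counts, kernel widths, and the 8192-sample input length are assumptions for this example, not the authors' published architecture (their networks were built in the Wolfram Language).

```python
import torch
import torch.nn as nn

class GWClassifier(nn.Module):
    """Minimal 1D CNN mapping a whitened strain segment to
    signal-vs-noise logits (illustrative sizes only)."""
    def __init__(self, input_len=8192):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=16), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=8), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=8), nn.ReLU(), nn.MaxPool1d(4),
        )
        # Infer the flattened feature size with a dummy forward pass.
        with torch.no_grad():
            n_flat = self.features(torch.zeros(1, 1, input_len)).numel()
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(n_flat, 64), nn.ReLU(), nn.Linear(64, 2)
        )

    def forward(self, x):                      # x: (batch, 1, input_len)
        return self.head(self.features(x))     # logits for [noise, signal]
```

A regression (“predictor”) network would share the same kind of convolutional front end but end in two linear outputs for the component masses instead of class logits.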
In tackling the problem, the researchers divided it into two separate parts – first a classifier network to provide a confidence level for the signal detection, and a second network, referred to as the “predictor,” to estimate the parameters of the source of the signal, in this case the component masses of the binary black hole (BBH). The predictor is triggered when the classifier identifies a signal with high probability.
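That trigger logic can be sketched in a few lines; the `classifier`, `predictor`, and 0.5 probability threshold below are illustrative placeholders rather than the authors' actual code or settings.

```python
import torch

def deep_filtering_step(segment, classifier, predictor, threshold=0.5):
    """Classify one strain segment; only if a signal is reported with
    high enough probability, run the predictor to estimate the BBH
    component masses. All names and values here are illustrative."""
    with torch.no_grad():
        p_signal = torch.softmax(classifier(segment), dim=-1)[0, 1].item()
        if p_signal < threshold:
            return None                                  # treated as noise
        m1, m2 = predictor(segment).squeeze(0).tolist()  # regression output
    return {"p_signal": p_signal, "m1": m1, "m2": m2}
```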
The researchers used both fairly simple and more complicated versions of the classifier and predictor networks, and interestingly the simpler versions performed nearly as well:
“The simple classifier and predictor are only 2 MB in size each, yet they achieve excellent results. The average time taken for evaluating them per input of 1 second duration is approximately 6.7 milliseconds, and 106 microseconds using a single CPU and GPU respectively. The deeper predictor CNN, which is about 23 MB, achieves slightly better accuracy at parameter estimation but takes about 85 milliseconds for evaluation on the CPU and 535 microseconds on the GPU, which is still orders of magnitude faster than real time. Note that the current deep learning frameworks are not well optimized for CPU evaluation.
“For comparison, we estimated an evaluation time of 1.1 seconds for time-domain matched filtering on the same CPU (using two cores) with the same template bank of clean signals used for training; the results are shown in Fig. 16. This fast inference rate indicates that real-time analysis can be carried out with a single CPU or GPU, even with DNNs that are significantly larger and trained with template banks of millions of signals. Note that CNNs can be trained on millions of inputs in a few hours using distributed training on parallel GPUs. Furthermore, the input layer of the CNNs can be modified to consider inputs/templates of any duration, which will result in the computational cost scaling linearly with the input size. Therefore, even with inputs that are 1000 s long, the analysis can still be carried out in real time.”
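For context on the baseline in that comparison, a bare-bones time-domain matched filter correlates the data against every template in the bank and keeps the loudest normalized peak, so its cost grows with the number and length of templates, whereas a trained DNN's evaluation cost does not. The NumPy sketch below is a deliberately simplified illustration that ignores whitening, PSD weighting, and the other machinery of a production search.

```python
import numpy as np

def matched_filter_search(data, templates):
    """Naive time-domain matched filter: slide each template over the
    data and return the best (template index, offset, peak statistic).
    Cost scales with the size of the template bank."""
    best = (None, None, -np.inf)
    for i, h in enumerate(templates):
        h = h / np.linalg.norm(h)                   # unit-normalize template
        corr = np.correlate(data, h, mode="valid")  # slide template over data
        j = int(np.argmax(np.abs(corr)))
        if abs(corr[j]) > best[2]:
            best = (i, j, float(abs(corr[j])))
    return best
```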
They also assessed performance on various GPUs and CPUs and noted that most of the intensive training was done on NVIDIA Tesla P100 GPUs with version 11 of the Wolfram Language; however, a few test sessions were performed with NVIDIA Tesla K40, GTX 1080, and GT 940M GPUs.
The researchers conclude that DNNs for multimessenger astrophysics offer opportunities “to harness AI computing with rapidly emerging hardware architectures and software optimized for deep learning. In addition, the use of state-of-the-art HPC facilities will continue to be used to numerically model GW sources, getting insights into the physical processes that lead to EM signatures, while also providing the means to continue using distributed computing to train DNNs.”
Link to paper: https://journals.aps.org/prd/abstract/10.1103/PhysRevD.97.044039