Two months ago, the first-ever image of a black hole took the internet by storm. A team of scientists took years to produce and verify the striking image – and now, Tom Coughlin (on behalf of Supermicro) has provided a behind-the-scenes look at the hardware that helped make it possible.
Massive data
Producing the black hole image required processing four petabytes of data. This data – collected from a black hole 55 million light-years away – was generated by an array of eight international radio telescopes using global very long baseline interferometry (VLBI) over the course of five nights.
Each telescope was equipped with a powerful receiver and backend electronics that allowed it to mix the sky signal down to baseband, digitize it, and record it directly to hard disk drives (HDDs), creating petabytes of raw VLBI data. This Event Horizon Telescope (EHT) data was recorded at a rate of 64 Gb/s, totaling around 350 TB per telescope per night. This necessitated 1,024 HDDs (128 per telescope, divided among four digital backend systems with 32 drives each), all of them helium-filled and sealed to ensure reliable operation at high altitudes.
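The drive counts and recording figures above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes decimal units (1 TB = 1,000 GB) and reads the recording rate as 64 gigabits per second, the reading under which 350 TB corresponds to roughly one night of observing. The constant and function names are made up for this example.

```python
# Back-of-the-envelope check of the recording figures quoted in the article.
# Assumptions (not from the article): decimal units (1 TB = 1000 GB) and a
# recording rate interpreted as 64 gigabits per second.

TELESCOPES = 8
BACKENDS_PER_TELESCOPE = 4
DRIVES_PER_BACKEND = 32
RATE_GBIT_PER_S = 64        # assumed to mean gigabits, not gigabytes
DATA_PER_NIGHT_TB = 350


def drives_per_telescope() -> int:
    """Drives attached to one telescope across all of its backends."""
    return BACKENDS_PER_TELESCOPE * DRIVES_PER_BACKEND


def total_drives() -> int:
    """Drives across the whole array."""
    return TELESCOPES * drives_per_telescope()


def recording_hours(data_tb: float, rate_gbit_per_s: float) -> float:
    """Hours needed to write data_tb terabytes at the given bit rate."""
    rate_gbyte_per_s = rate_gbit_per_s / 8      # 8 bits per byte
    seconds = data_tb * 1000 / rate_gbyte_per_s
    return seconds / 3600


if __name__ == "__main__":
    print(drives_per_telescope())   # 128
    print(total_drives())           # 1024
    print(round(recording_hours(DATA_PER_NIGHT_TB, RATE_GBIT_PER_S), 1))  # 12.2
```

At 64 gigabits per second, 350 TB takes about 12 hours to record, which is consistent with a single night's observing run.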
Signal processing & correlation
But “the real key,” Coughlin writes, was the use of advanced signal processing algorithms. Once the data was collected, the drives were flown to the Max Planck Institute (MPI) for Radio Astronomy and to the MIT Haystack Observatory, which analyzed the high-frequency and low-frequency band data, respectively.
To process the data, researchers used DiFX software running on MIT’s and MPI’s HPC clusters. MIT’s cluster, made up of 60 compute nodes (38 of them Supermicro TwinPro systems), is housed in ten racks containing three generations of Supermicro servers, with all servers using Mellanox switches and the newest servers using 10-core Intel Xeon CPUs. MIT’s system had about half a petabyte of storage.
The MPI cluster, meanwhile, used three Supermicro head node servers, 68 compute nodes (comprising 1,360 cores), 11 Supermicro storage RAID servers (running BeeGFS) with a capacity of 1.6 petabytes, FDR InfiniBand networking and more. Supermicro also provided headend nodes – 2×2 clustered – to account for the high resource demands of DiFX.
These clusters, known as “correlators,” aligned the data from the telescopes in time and position, accounting for the differences in distance and location between telescopes that would otherwise disrupt accurate image processing.
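The idea behind time alignment can be illustrated with a toy cross-correlation. The sketch below is not DiFX and makes no claim about how the EHT correlators are implemented. It simulates two stations that receive the same noisy signal offset by a few samples, then searches for the lag at which the two recordings match best. The function names, noise level, and seven-sample delay are all assumptions chosen for illustration.

```python
import random


def cross_correlation(a, b, lag):
    """Sum of a[i] * b[i + lag] over the samples where both exist."""
    start = max(0, -lag)
    stop = min(len(a), len(b) - lag)
    return sum(a[i] * b[i + lag] for i in range(start, stop))


def estimate_delay(a, b, max_lag):
    """Return the lag (in samples) at which b best matches a."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: cross_correlation(a, b, lag))


def make_station_pair(n, true_delay, noise=0.5, seed=0):
    """Simulate two stations seeing the same sky signal, offset in time.

    Station B records the signal true_delay samples later than station A,
    and each station adds its own independent receiver noise.
    """
    rng = random.Random(seed)
    sky = [rng.gauss(0, 1) for _ in range(n + true_delay)]
    station_a = [sky[i + true_delay] + rng.gauss(0, noise) for i in range(n)]
    station_b = [sky[i] + rng.gauss(0, noise) for i in range(n)]
    return station_a, station_b


if __name__ == "__main__":
    a, b = make_station_pair(n=2000, true_delay=7)
    print(estimate_delay(a, b, max_lag=20))  # 7
```

A real correlator does far more than this, working across many station pairs and frequency channels, but the core step of finding the offset at which two recordings line up is the same.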
As the data was correlated at the two sites, the results were cross-checked between them. Once correlated, the data was processed for use in imaging, time-domain analysis and modeling.
The future
Coughlin also outlines the future of black hole observations. The EHT, for instance, observed black holes at only one wavelength, but the team hopes to observe at another soon, improving resolution in the process. Researchers are also looking into space-based telescopes and algorithmic processing approaches to refine the resolution of the existing images. “We could make movies instead of pictures,” said EHT Director Sheperd Doeleman.
There are many hurdles to this future, of course. First and foremost is data transmission: a space-based imaging approach would all but eliminate the possibility of physically shipping hard drives to labs on Earth.