Many billions of years ago, the universe was a swirling pool of gas. Unraveling the story of how we got from there to here isn’t an easy task, with many simulations of large swaths of the universe taking years to complete on powerful supercomputers. In a talk for the ICM Seminars series (hosted by the Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw), Dr. Simon Mutch highlighted how Australian research organizations are working around the computational requirements to deliver insights into the origins of the universe as we know it.
The reionization of the universe – and why we should care about it
“Immediately after the Big Bang, the universe was filled mostly just with neutral hydrogen – there wasn’t really much else there,” said Mutch, who is a postdoctoral fellow at the University of Melbourne’s ASTRO 3D Centre of Excellence and a senior research data specialist for the Melbourne Data Analytics Platform. “But then there were gravitational perturbations that caused gas to collapse in on itself, and eventually stars and galaxies began to form, and those stars gave out light, which was of high enough energy to start to ionize the surrounding neutral gas, so it stripped electrons off that neutral gas and changed its properties.”
This ionization, he explained, spread into bubbles, and as galaxies grew in number and size, the bubbles began to overlap, eventually resulting in the total reionization of the universe – some 12.5 billion years ago. The relationship between these ionized bubbles and the galaxies that birthed them is a major focus for Mutch and his colleagues.
“That’s really interesting, because it means that if we can observe this reionization signal … then we can infer something about the galaxies which are driving this reionization process,” Mutch said. “What’s even more interesting is that the reionization signal is sensitive to all galaxies.” In terms of the galaxies populating the universe, he explained, relatively faint galaxies are the most common – but also the most difficult to see. “By studying this reionization structure,” he said, “we can actually learn something about the very smallest, very faintest galaxies that we can’t actually see.”
Mutch compared the process to dropping stones in a pond and studying the ripples to understand the shapes and sizes of each stone. “The main problem is: how do we connect the properties of galaxies to the signature of the bubbles that we see during reionization?” he said. “For that, we use cosmological simulations.”
The trouble with simulations
Cosmological simulations of galaxy formation build a chunk of the universe from the ground up, accounting for elements like gravity, dark matter, heating, cooling, turbulence, chemistry, supernovae, black holes, magnetic fields and more, which are all woven into hydrodynamical or mesh models.
“While these are incredibly powerful, they are extremely computationally intensive,” Mutch said. “That’s because there is a large dynamic range, both in terms of temporal and spatial resolution.” By way of examples, he discussed IllustrisTNG, a galaxy formation simulation one billion light years across that required 35 million CPU hours on the Hazel Hen supercomputer at the High Performance Computing Center (HLRS) in Stuttgart. Similarly, he said, the larger BlueTides simulation took 20 million CPU hours on the Blue Waters system at NCSA, nearly taking up the entire machine.
The necessary scale of simulating reionization compounded the high computational needs. “What we’re always doing is making this tradeoff between the amount of resolution we get in the simulation and the size of the simulation,” Mutch said. “This problem is particularly acute, though, if you’re talking about the early universe and reionization.” The reionization bubbles were tens of millions of light years across, so in order to produce a statistically relevant sample of them, you would need many bubbles – and a massive simulation.
Normally, researchers adjust a simulation's parameters until its output matches observations of the nearby, well-studied universe. But not much is known about the early universe – so instead, Mutch and his colleagues needed to run “many, many different realizations of the simulations” to test different models, feedback processes and other variables to see how they affected the ionized bubbles.
Finding a path through the cosmos
Tackling this uphill battle was the goal for the University of Melbourne’s Dark-ages, Reionization And Galaxy-formation Observables Numerical Simulation, or “DRAGONS,” program. (“We love our acronyms in astronomy,” Mutch said. “Everything needs to have a good acronym.”)
Thankfully, he said, the universe gave them a helping hand. Overall, the universe consists of around 70% dark energy, 25% dark matter and only 5% normal matter – what we interact with in our daily lives. “What this actually means is that we can do a pretty good job of simulating the position and the large-scale distribution of the matter by simply ignoring normal matter, and that makes things much easier,” Mutch said, explaining that they could ignore gas, shocks, star formation and more. “All we care about is getting the large-scale distribution of matter correct.”
So the researchers developed an N-body (particle) simulation that treated all the matter in the universe as collisionless. “We can pour all our computing power into doing this problem of gravity, essentially, and doing as big a simulation to as high a resolution as we possibly can,” Mutch said. They ran a large N-body simulation – about 300 million light years on each side – with billions of particles, each corresponding to a mass about 400 million times that of the sun.
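To make the idea of a collisionless, gravity-only simulation concrete, here is a minimal sketch in Python. It is not the DRAGONS code: production N-body codes use tree or mesh methods and an expanding cosmological background, whereas this toy uses brute-force pairwise summation, a softened force law and arbitrary units (all assumptions made for illustration). The function names are hypothetical.

```python
G = 1.0  # gravitational constant in arbitrary code units (assumption)


def accelerations(pos, mass, softening=0.05):
    """Direct-summation gravitational acceleration on every particle.

    The softening length keeps the force finite when two particles pass
    close to each other, which is how collisionless codes avoid
    treating close encounters as hard collisions.
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[j][k] - pos[i][k] for k in range(3)]  # vector from i to j
            inv_d3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + softening ** 2) ** -1.5
            for k in range(3):
                acc[i][k] += G * mass[j] * d[k] * inv_d3  # i is pulled toward j
                acc[j][k] -= G * mass[i] * d[k] * inv_d3  # j is pulled toward i
    return acc


def leapfrog(pos, vel, mass, dt, n_steps):
    """Kick-drift-kick leapfrog integration; returns new positions and velocities."""
    pos = [p[:] for p in pos]  # copy so the caller's lists are not modified
    vel = [v[:] for v in vel]
    acc = accelerations(pos, mass)
    for _ in range(n_steps):
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * acc[i][k]  # half kick
                pos[i][k] += dt * vel[i][k]        # drift
        acc = accelerations(pos, mass)
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * acc[i][k]  # second half kick
    return pos, vel


def total_momentum(vel, mass):
    """Sum of mass times velocity over all particles, per axis."""
    return [sum(m * v[k] for m, v in zip(mass, vel)) for k in range(3)]
```

The pairwise loop scales with the square of the particle count, which is why a run with billions of particles needs far more sophisticated force solvers than this sketch.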
Looking at the simulation, the researchers then identified “knots” in the images – formations called “dark matter halos” where galaxies would start to form. The researchers tracked these halos through the simulation, building hierarchical merger trees that described how the halos coalesced over time. Using a semi-analytic galaxy formation model, they then “painted on” galaxies over the halos.
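A merger tree of this kind can be pictured as a simple linked data structure. The sketch below is a hypothetical minimal schema, not the format used by the DRAGONS team: each halo records the snapshot it was found in and the earlier halos (its progenitors) that merged to form it. The masses in the toy tree are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Halo:
    """One dark matter halo at one snapshot (hypothetical minimal schema)."""
    halo_id: int
    snapshot: int
    mass: float  # in solar masses
    progenitors: List["Halo"] = field(default_factory=list)


def main_branch(halo):
    """Walk back in time from `halo`, always following the most massive progenitor."""
    branch = [halo]
    while branch[-1].progenitors:
        branch.append(max(branch[-1].progenitors, key=lambda h: h.mass))
    return branch


def count_mergers(halo):
    """Count mergers in the tree: a halo with k > 1 progenitors contributes k - 1."""
    own = max(len(halo.progenitors) - 1, 0)
    return own + sum(count_mergers(p) for p in halo.progenitors)


# Toy tree with invented masses: two small halos merge, then merge again.
a = Halo(1, snapshot=0, mass=4.0e9)
b = Halo(2, snapshot=0, mass=8.0e9)
c = Halo(3, snapshot=0, mass=1.1e10)
d = Halo(4, snapshot=1, mass=1.3e10, progenitors=[a, b])
e = Halo(5, snapshot=1, mass=1.2e10, progenitors=[c])
root = Halo(6, snapshot=2, mass=2.6e10, progenitors=[d, e])
```

A semi-analytic model would walk a structure like this from the earliest snapshot forward, updating each galaxy's properties as its host halo grows and merges.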
“What we also did, which was unique at the time with the DRAGONS program,” Mutch said, “is that we use the information of these galaxies to calculate how many ionizing photons they were producing and then fed that into another code, called a seminumerical model, that then was able to give us what the ionization state of the volume was as a function of position. So basically, it allowed us to figure out where these ionized bubbles were.”
With that process in hand, the researchers would then evolve the galaxies again, run the seminumerical model again to get the ionization results and repeat the process until the ionization was complete.
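The shape of that loop can be sketched with a toy model. Everything below is an illustrative assumption rather than the team's method: the "volume" is a one-dimensional row of cells, galaxies grow by a fixed fraction per step, growth is damped in ionized cells as a stand-in for feedback, and a cell counts as ionized once the average photon output in its neighbourhood crosses an arbitrary threshold.

```python
def evolve_galaxies(stellar_mass, ionized, growth=0.5, suppression=0.5):
    """Grow each cell's galaxy by one step; growth is damped in ionized cells (toy feedback)."""
    new_mass = []
    for m, is_ionized in zip(stellar_mass, ionized):
        rate = growth * (suppression if is_ionized else 1.0)
        new_mass.append(m * (1.0 + rate))
    return new_mass


def ionization_field(photons, threshold=1.0, radius=1):
    """Toy stand-in for the seminumerical step.

    A cell is flagged ionized when the mean cumulative photon count in
    its neighbourhood reaches the threshold.
    """
    n = len(photons)
    flags = []
    for i in range(n):
        window = photons[max(0, i - radius):min(n, i + radius + 1)]
        flags.append(sum(window) / len(window) >= threshold)
    return flags


def run_reionization(stellar_mass, photons_per_mass=1.0, max_steps=1000):
    """Alternate galaxy evolution and ionization updates until every cell is ionized.

    Returns the ionized fraction after each step.
    """
    ionized = [False] * len(stellar_mass)
    photons = [0.0] * len(stellar_mass)
    history = []
    for _ in range(max_steps):
        new_mass = evolve_galaxies(stellar_mass, ionized)
        # Photons emitted this step are proportional to newly formed stellar mass.
        photons = [p + photons_per_mass * (new - old)
                   for p, new, old in zip(photons, new_mass, stellar_mass)]
        stellar_mass = new_mass
        ionized = ionization_field(photons)
        history.append(sum(ionized) / len(ionized))
        if all(ionized):
            break
    return history
```

Calling `run_reionization([0.01, 0.2, 0.05, 0.02, 0.1])` returns a list of ionized fractions that starts at zero and rises to one. The point of the sketch is the coupling: the ionization map from one step feeds back into how the galaxies evolve in the next.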
The computational implications
“What this allows us to do is to run one really expensive N-body simulation, on the order of tens or hundreds of millions of CPU hours, and just do that once,” Mutch explained. “And then we can keep running our semi-analytic model over the top of that.” The semi-analytic model, he said, took only on the order of ten CPU hours. “And that’s where things start to get really powerful.”
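A quick back-of-the-envelope calculation shows why this matters. The figures come from the talk as reported here (35 million CPU hours for the IllustrisTNG hydrodynamic run, about ten CPU hours per semi-analytic run); the 10 million CPU hours used for the N-body run in the usage note is an assumed value at the low end of the quoted range, and the helper names are hypothetical.

```python
HYDRO_RUN_CPU_HOURS = 35_000_000  # IllustrisTNG figure quoted earlier in the article
SEMI_ANALYTIC_CPU_HOURS = 10      # "on the order of ten CPU hours"


def runs_per_budget(budget_cpu_hours, cost_per_run):
    """How many runs of a given cost fit into a CPU-hour budget."""
    return budget_cpu_hours // cost_per_run


def total_cost(nbody_cpu_hours, n_models, cost_per_model=SEMI_ANALYTIC_CPU_HOURS):
    """One N-body run paid for once, plus n_models cheap semi-analytic runs on top."""
    return nbody_cpu_hours + n_models * cost_per_model
```

Under these numbers, `runs_per_budget(HYDRO_RUN_CPU_HOURS, SEMI_ANALYTIC_CPU_HOURS)` gives 3,500,000: the compute spent on a single hydrodynamic run would cover millions of semi-analytic runs. Likewise, `total_cost(10_000_000, 1000)` is 10,010,000 CPU hours, so exploring a thousand galaxy models adds about 0.1% to the one-off cost of the assumed N-body run.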
“What that means is we’re no longer restricted to running one really big cosmological hydrodynamic simulation once every few years and needing a large grant and a whole supercomputer to do it,” Mutch said. “Instead, we can start to explore what happens when we change different parameters in our galaxy model, and we can then see how that changes the signal from reionization.”
This capacity for rapid iteration leaves the researchers well positioned to interpret ionization results as upcoming high-power telescopes like the Square Kilometre Array (which is under construction in South Africa and Australia) begin to provide large amounts of data on the radio signals produced by reionization. “So that way, when we measure the ionized bubbles,” Mutch said, “we can infer something about the galaxies.”
Mutch is also taking part in the Genesis simulations under the government-funded ASTRO 3D program. The 30 researchers under Genesis are preparing to run a “big box” simulation – half a billion light years on each side – with 8,000³ particles (roughly 512 billion) inside of it. They expect the model to be “competitive on an international scale,” Mutch said.
To conduct the reionization simulations and the Genesis simulations, Mutch and his colleagues turned to homegrown supercomputing power. Initially, they were using NCI Australia’s Raijin supercomputer, an Intel-based system delivering 1.7 Linpack petaflops that barely squeaked into the most recent Top500 list. Raijin, however, is now being decommissioned, and the researchers are helping to stress test its replacement: Gadi.
While not yet complete, Gadi will boast 3,000 Cascade Lake nodes, each with two 24-core CPUs and 192 GB of memory; 160 nodes, each with four Nvidia V100 GPUs; and 50 large-memory nodes, each with 1.5 TB of memory. Already, Gadi’s first phase – installed in 2019 – is delivering 4.4 Linpack petaflops, placing it 47th in its first appearance on the Top500 list.