Fitting a model of an entire galaxy inside a computer is harder than it sounds, even when that computer is an 800-core cluster with over a terabyte of memory. Researchers at Durham University’s Institute for Computational Cosmology (ICC) know this well, because that is exactly what they are trying to do. An article in silicon.com this week documents how cosmologists have to develop creative modeling strategies to deal with the limitations of HPC machines.
ICC researchers have access to a cluster with 800 AMD processor cores, 1.6 TB of memory, and 300 TB of disk storage. That’s a decent-sized machine, but for galaxy formation simulations, the researchers are constantly butting up against hardware limitations. Take disk storage, for instance. A single simulation run on the effect of dark matter on galaxy formation can produce 20 TB of data, which means the scientists are constantly deleting older data or backing it up to tape. And according to the article, the cluster is not big or powerful enough to handle even large-scale models:
Physicists have to simplify the cosmological models they use in order to get ones that produce data sets small enough to be accurately processed by the 64-bit chips in the supercomputing cluster, and which can fit into the cluster’s available memory.
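To put the quoted hardware figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers given above (800 cores, 1.6 TB of memory, 300 TB of disk, up to 20 TB per run). The function names are made up for illustration, and decimal units (1 TB = 1000 GB) are an assumption.

```python
# Figures quoted in the article. Decimal units (1 TB = 1000 GB) are an assumption.
CORES = 800
MEMORY_TB = 1.6
DISK_TB = 300
RUN_OUTPUT_TB = 20  # what a single dark matter run can produce


def memory_per_core_gb(memory_tb, cores):
    """Average memory available per core, in GB."""
    return memory_tb * 1000 / cores


def runs_that_fit(disk_tb, run_output_tb):
    """How many full-size run outputs fit on disk at once."""
    return int(disk_tb // run_output_tb)


if __name__ == "__main__":
    print(memory_per_core_gb(MEMORY_TB, CORES), "GB of memory per core")
    print(runs_that_fit(DISK_TB, RUN_OUTPUT_TB), "runs fit on disk")
```

On these assumptions the cluster averages about 2 GB of memory per core, and the disk holds the output of roughly 15 full-size runs before something has to be deleted or moved to tape.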
Nevertheless, this is better than what most cosmologists had available to them even a few years ago. At that time they could only simulate a few thousand particles per galaxy (so each particle had to represent 10,000 to 100,000 stars). Today that granularity is two orders of magnitude better.
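As a rough illustration of what a two-orders-of-magnitude gain means, the sketch below assumes the improvement applies directly to the number of stars each particle stands for, with the size of the galaxy held fixed. That reading is an assumption, not something the article spells out, and the function name is hypothetical.

```python
# Assumption: "two orders of magnitude better" means each simulation particle
# now stands in for 100x fewer stars than before.
IMPROVEMENT_FACTOR = 100

# Range quoted for the older simulations: stars represented by one particle.
OLD_STARS_PER_PARTICLE = (10_000, 100_000)


def stars_per_particle_now(old_value, factor=IMPROVEMENT_FACTOR):
    """Stars represented by one particle after the resolution improvement."""
    return old_value // factor


if __name__ == "__main__":
    low, high = (stars_per_particle_now(v) for v in OLD_STARS_PER_PARTICLE)
    print(f"Each particle now represents roughly {low} to {high} stars")
```

Under that assumption, a particle that once represented 10,000 to 100,000 stars would now represent about 100 to 1,000.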
Better yet, the Institute is getting a new cluster in December that has a lot more compute power, memory and storage than its current setup. The new hardware will enable the researchers to create higher-fidelity models and “get a much more realistic calculation.”