There are several limitations to performing HPC in a public cloud, and a few are specific to computational fluid dynamics (CFD). An intensive CFD application, like other parallel scientific computing applications, has to distribute pieces of the problem across many processors and exchange intermediate results with a coordinating process several times before synthesizing a solution.
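That iterate-and-exchange cycle is what makes network latency so punishing. The sketch below is a rough illustration of the pattern using mpi4py; the domain split, the update step, and the iteration count are hypothetical stand-ins, not the solver the Bonn researchers benchmarked.

# Minimal sketch of the scatter/compute/gather cycle that makes a parallel
# CFD run sensitive to network latency. Illustrative only, not the code
# from the Bonn study.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# The root rank splits the flow field into one chunk per process.
field = np.random.rand(size * 1000) if rank == 0 else None
chunk = np.empty(1000)
comm.Scatter(field, chunk, root=0)

for step in range(100):
    # Local work: a hypothetical stand-in for a real finite-volume update.
    chunk = 0.5 * (chunk + np.roll(chunk, 1))
    # Each iteration the ranks agree on a global quantity such as a residual;
    # every one of these collective calls pays the interconnect latency.
    residual = comm.allreduce(float(np.abs(chunk).sum()), op=MPI.SUM)

# Results are gathered back at the root for post-processing.
result = np.empty(size * 1000) if rank == 0 else None
comm.Gather(chunk, result, root=0)

Run under mpirun with, say, 8 processes, this loop issues 100 latency-bound collectives; on a low-latency HPC fabric they are cheap, but on a virtualized cloud interconnect those round trips add up quickly.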
In the cloud, that constant communication runs over a comparatively slow interconnect, which means unacceptably high latencies. Researchers from the University of Bonn in Germany examined those latency issues by running a common CFD solver on Amazon EC2 HPC instance types with both CPU and GPU cores.
For CPUs on Amazon’s EC2 cluster instances, they found that the application running on 8 CPU cores achieved a parallel efficiency of 70 percent relative to a non-virtualized HPC cluster. “Beyond that limit, we run into network interconnect bandwidth problems if we do not reserve more instances. After an explicit request for more CPU compute instances, we have seen even for up to 256 CPU cores / 32 instances an acceptable parallel efficiency of more than 50 percent.”
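For context, parallel efficiency is typically computed as speedup over a reference run, divided by the number of cores; in this study the reference was a non-virtualized HPC cluster. The timings below are invented purely to illustrate the arithmetic behind a figure like 70 percent, and are not measurements from the paper.

# Parallel efficiency: speedup over a reference timing, divided by core count.
# The numbers are illustrative only, not results from the Bonn study.
def parallel_efficiency(t_reference, t_parallel, cores):
    speedup = t_reference / t_parallel
    return speedup / cores

# If a reference run takes 800 s and the same case finishes in 143 s on
# 8 cloud cores, efficiency is (800 / 143) / 8, roughly 0.70.
print(parallel_efficiency(800.0, 143.0, 8))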
With regard to the GPUs, they found similarly acceptable efficiency at 8 GPU cores, but performance petered out with further scaling. “Overall, we expect an acceptable scaling on more than 128 CPU cores or more than 8 GPUs if we pre-request an appropriate number of instances and avoid ECC in the case of GPUs.”
In short, according to the researchers, “we believe that Amazon’s HPC cloud is well prepared for moderately sized parallel CFD problems on up to 64 CPU cores or 8 GPUs.” That bodes well for the future of mid-level scientific HPC performed in the cloud.