Using HPC (High Performance Computing) to solve Computational Fluid Dynamics (CFD) challenges has become common practice. As the performance growth of individual systems, from HPC workstation to supercomputer, has slowed over the last decade or two, compute clusters have increasingly taken the place of single, big SMP (symmetric multiprocessing) supercomputers, and have become the ‘new normal’. Another, more recent innovation, the cloud, has also enabled dramatic growth in total throughput.
This post will show you good practices for setting up an HPC cluster on AWS running Ansys Fluent (a commercial computational fluid dynamics software package) in just a few minutes. In addition, you will find some sample scripts to install Ansys Fluent and run your first job. ‘Good practice’ is a relative term, and in the cloud even more so, as there are many possibilities (aka services) that can be combined in different ways to achieve the same goal. Whether one option is better than another can only be decided in the context of the specific application characteristics or application features to be used. For example, “a high performance parallel file system is better than an NFS share” is true for the vast majority of HPC workloads, but there could be cases (like less I/O-intensive applications, or small HPC clusters created to run a few small jobs) where an NFS share is more than enough, and it’s cheaper and simpler to set up. In this post we will share what we consider good practices, together with some additional options – valid alternatives that you may wish to consider.
The main cluster components we will use are the following AWS services:
- AWS ParallelCluster, an AWS-supported open source cluster management tool to deploy and manage HPC clusters in the AWS cloud.
- The new Amazon EC2 C5n instances that can use up to 100 Gbps of network bandwidth.
- Amazon FSx for Lustre, a highly parallel file system that supports sub-millisecond access to petabyte-scale file systems, designed to deliver 200 MB/s of aggregate throughput at 10,000 IOPS for every 1 TiB of provisioned capacity.
- NICE DCV as the remote visualization protocol.
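The components above come together in the AWS ParallelCluster configuration file. The sketch below is a minimal example for ParallelCluster 2.x, not a definitive setup: the key pair name, VPC and subnet IDs, section labels, and sizing values are placeholders you must replace with your own.

```ini
# Minimal AWS ParallelCluster 2.x configuration sketch:
# C5n compute nodes, FSx for Lustre shared storage, NICE DCV on the master.
[global]
cluster_template = default

[aws]
aws_region_name = us-east-1

[cluster default]
key_name = my-key                   # placeholder: your EC2 key pair
base_os = centos7
master_instance_type = c5n.2xlarge
compute_instance_type = c5n.18xlarge
placement_group = DYNAMIC           # cluster placement group for low latency
vpc_settings = public
fsx_settings = parallel-fs
dcv_settings = remote-viz

[vpc public]
vpc_id = vpc-xxxxxxxx               # placeholder
master_subnet_id = subnet-xxxxxxxx  # placeholder

[fsx parallel-fs]
shared_dir = /fsx
storage_capacity = 3600             # GiB, the smallest file system size

[dcv remote-viz]
enable = master                     # serve a DCV session from the master node
```

We will come back to the cluster configuration later; the point here is simply that each of the services listed above maps to a small block of this one file.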
Note: We announced Elastic Fabric Adapter (EFA) at re:Invent 2018, and have recently launched the service in multiple AWS regions. EFA is a network device that you can attach to your Amazon EC2 instances to accelerate HPC applications, providing lower and more consistent latency and higher throughput than the TCP transport traditionally used in cloud-based HPC systems. It enhances the performance of inter-instance communication critical for scaling HPC applications, and is optimized to work on the existing AWS network infrastructure. Ansys Fluent is not yet ready for use with EFA, so the use of this specific network device will not be extensively covered in this post.
Note: Ansys Fluent is a commercial software package that requires a license. This post assumes that you already have your Ansys Fluent license on (or accessible from) AWS. Also, the installation script you will find below requires the Ansys installation packages. You can download the current release from Ansys under “Downloads → Current Release”.
First step: Create a Custom AMI
To speed up cluster creation and, most importantly, to shorten the time needed to start up the compute nodes, it’s good practice to create a custom AMI that has certain packages preinstalled and settings pre-configured.
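Once built, the custom AMI is referenced from the ParallelCluster configuration so that compute nodes boot from the pre-baked image instead of installing packages at startup. In a ParallelCluster 2.x configuration file this is a single line in the `[cluster]` section (the AMI ID shown is a placeholder):

```ini
[cluster default]
# Boot master and compute nodes from the pre-configured image
# (replace with the AMI ID you create in this section).
custom_ami = ami-xxxxxxxxxxxxxxxxx
```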
- Start from an existing AMI and note down the AMI ID appropriate to the region where you plan to deploy your cluster; see our list of AMIs by region. For example, we started with CentOS 7 in Virginia (us-east-1), where the AMI ID is ami-0a4d7e08ea5178c02.
- Open the AWS Console and launch an instance in your preferred region (the same one you chose your AMI from), using the AMI ID noted above.
- Make sure that your instance is accessible from the internet and has a public IP address.
- Give the instance an IAM role that allows it to download files from S3 (or from a specific S3 bucket).
- Optionally, tag the instance (e.g., Name = Fluent-AMI-v1).
- Configure the security group to allow incoming connections on port 22.
- If you need additional details on how to create a custom AMI for AWS ParallelCluster, please refer to Building a custom AWS ParallelCluster AMI in the official documentation.
- Once the instance is ready, ssh into it and run the following commands as root: