From the AWS re:Invent main stage in Las Vegas today, Amazon Web Services CEO Andy Jassy introduced Amazon FSx for Lustre, citing a growing body of applications that require the high performance and low latency of scale-out, parallel file systems. Based on the open source Lustre project, the fully managed, highly parallel file system addresses the storage needs of high-performance computing, machine learning and media data processing workflows, Amazon said.
A set of AWS customers, said Jassy, have workloads with high throughput that need very low latency and massive parallel scale-out. They need an HPC file system, but without the management responsibilities that come along with it, he added.
“FSx for Lustre handles that very demanding set of performance characteristics, very high throughput, low-latency, hundreds of gigabytes per second, and millions of IOPS,” said the AWS chief. “It has seamless integration with S3 so you can have the data stored on S3, you can easily move it to FSx for Lustre, or you can point the FSx for Lustre at S3 and it will automatically move it over. And then when you’re done your processing, you can write that data back to S3 and shut down the Lustre file systems.”
Amazon customers can create and launch an Amazon FSx file system using the AWS Management Console, the AWS CLI, or an AWS SDK. FSx for Lustre is compatible with most popular Linux-based AMIs, including Red Hat Enterprise Linux (RHEL), CentOS, Ubuntu, and SUSE Linux.
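For readers who want to see what the SDK route might look like, here is a minimal sketch in Python using boto3's FSx client and its create_file_system call. The helper names, the subnet ID and the bucket name are hypothetical, and the 3,600 GiB step is an assumption about how the 3.6 TiB increment is expressed in the API request; check the current FSx documentation before relying on any of it.

```python
def build_lustre_request(subnet_id, increments=1, import_path=None):
    """Build keyword arguments for the FSx CreateFileSystem call.

    Assumption: capacity is requested in GiB, in steps of 3,600 GiB
    (the 3.6 TiB increment mentioned in the article).
    """
    if increments < 1:
        raise ValueError("increments must be at least 1")
    params = {
        "FileSystemType": "LUSTRE",
        "StorageCapacity": 3600 * increments,
        "SubnetIds": [subnet_id],
    }
    if import_path:
        # Link the file system to an S3 bucket so its objects appear as files.
        params["LustreConfiguration"] = {"ImportPath": import_path}
    return params


def create_lustre_file_system(subnet_id, increments=1, import_path=None):
    """Send the request to AWS. Needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    client = boto3.client("fsx")
    request = build_lustre_request(subnet_id, increments, import_path)
    return client.create_file_system(**request)


# Hypothetical values, for illustration only.
example = build_lustre_request(
    "subnet-0123456789abcdef0", increments=2, import_path="s3://example-bucket"
)
```

Keeping the request builder separate from the network call makes the parameters easy to inspect or log before anything is provisioned.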
“You can create a file system in minutes, mount it on any number of clients, and start accessing it right away,” wrote AWS Chief Evangelist Jeff Barr, in a blog post. “This is a fully managed service so there’s nothing to maintain and nothing to administer.”
Automated operations are said to eliminate administrative overhead and ongoing maintenance. “Amazon FSx performs routine Lustre updates, and detects and addresses hardware issues,” notes the cloud provider.
Each file system is backed by NVMe SSD storage, provisioned in increments of 3.6 TiB. Every 1 TiB of provisioned capacity provides 200 MBps (megabytes per second) of aggregate throughput at 10,000 IOPS. The underlying storage is not replicated, so it is not intended as a long-term repository.
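To make the scaling concrete, the sketch below works through the arithmetic, treating the per-TiB throughput figure as 200 megabytes per second and assuming that throughput and IOPS grow linearly with provisioned capacity. The function names are illustrative and not part of any AWS tool.

```python
import math
from decimal import Decimal

INCREMENT_TIB = Decimal("3.6")   # provisioning step from the article
MB_PER_S_PER_TIB = 200           # assumed to mean megabytes per second
IOPS_PER_TIB = 10_000


def sizing(increments):
    """Return capacity, aggregate throughput and IOPS for N increments."""
    capacity = INCREMENT_TIB * increments
    return {
        "capacity_tib": capacity,
        "throughput_mb_s": capacity * MB_PER_S_PER_TIB,
        "iops": capacity * IOPS_PER_TIB,
    }


def increments_for_throughput(target_mb_s):
    """Smallest number of increments that reaches a target throughput."""
    per_increment = INCREMENT_TIB * MB_PER_S_PER_TIB  # 720 MB/s
    return math.ceil(Decimal(target_mb_s) / per_increment)


one = sizing(1)                            # 3.6 TiB, 720 MB/s, 36,000 IOPS
needed = increments_for_throughput(100_000)  # 139 increments for 100 GB/s
```

Under these assumptions a single 3.6 TiB increment delivers 720 MB/s and 36,000 IOPS, and reaching 100 GB/s takes 139 increments, or roughly 500 TiB of provisioned capacity.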
Amazon FSx for Lustre is available now in the US East (N. Virginia, Ohio), US West (Oregon) and Europe (Ireland) regions. File systems can be accessed from EC2 instances, via AWS Direct Connect (which connects a customer’s existing data center or colo to AWS), or over a VPN. Pricing is $0.14 per GB-month in the US regions and $0.154 per GB-month in Europe.
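As a rough illustration of the quoted rates, the sketch below estimates a storage bill for a file system kept for a full month. The region keys and function name are made up for the example, and it ignores any pro-rating for file systems that are shut down partway through the month, which the article does not describe.

```python
from decimal import Decimal

# Launch prices quoted in the article, in US dollars per GB-month.
PRICE_PER_GB_MONTH = {
    "us": Decimal("0.14"),
    "europe": Decimal("0.154"),
}


def monthly_storage_cost(capacity_gb, region):
    """Cost in dollars of keeping capacity_gb provisioned for one month."""
    return PRICE_PER_GB_MONTH[region] * capacity_gb


us_cost = monthly_storage_cost(3600, "us")        # 504 dollars
eu_cost = monthly_storage_cost(3600, "europe")    # 554.40 dollars
```

Decimal is used rather than float so the currency arithmetic stays exact.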
The new Amazon FSx family, launched this week, also includes a file system for Windows environments. “Built on Windows Server, Amazon FSx for Windows File Server provides a fully compatible Microsoft Windows File System, with full integration with customers’ Active Directory environment, including Active Directory domains, Windows access controls, and a native Windows Explorer experience,” said Amazon.
Amazon FSx complies with PCI DSS, ISO 9001, 27001, 27017, and 27018, and meets HIPAA eligibility standards.
Can you help me move that?
Yesterday, Amazon announced a new managed data transfer service that it claims can run 10 times as fast as open source data transfer schemes. The company says its AWS DataSync service uses network acceleration to simplify and automate data transfer between on-premises storage and Amazon S3 or Amazon Elastic File System. “AWS DataSync automatically handles many of the tasks related to data transfers that can slow down migrations or burden IT operations, including running one’s own instances, handling encryption and managing scripts,” said Amazon.
Univa is a DataSync launch partner. “As HPC workloads dynamically migrate to AWS, the data and results need to move as well,” said Rob Lalonde, vice president and general manager of cloud at Univa Corporation. “AWS DataSync coupled with our Navops Launch automation provides for data movement to be optimized so that just the required data is migrated to the cloud, and just the results are returned. This automation minimizes unnecessary data movement and reduces the time required between the creation of an instance, the movement of the data and the execution of the workload. In HPC, time is money and AWS DataSync saves both.”