Achieving 10-minute forecast intervals up to 15 hours ahead using AWS ParallelCluster
Weathernews Inc. is one of the world’s largest private meteorological companies, and the global leader in weather forecasting. Weathernews uses AWS ParallelCluster to manage its computing resources to produce unprecedented high-resolution forecasts, boasting 10-minute forecast intervals up to 15 hours ahead. HPC environments in the cloud allow flexible procurement of large amounts of computing resources, and create an environment that can respond flexibly to load fluctuations. By using Amazon Web Services (AWS), Weathernews achieved 90%+ forecast accuracy.
“Using AWS, we were able to forecast rainclouds with 10-minute forecast intervals up to 15 hours ahead. AWS is a platform that allows engineers to freely play with their own ideas and to create new services.” –Tomohiro Ishibashi, Weathernews Inc., Managing Director, Executive Officer
A new step toward achieving the dream of meteorologists — longer-term, more accurate forecasting
Founded in 1986, Weathernews is one of the world’s largest private meteorological companies, with sales and operations bases in major cities around the world, serving approximately 50 countries. The company’s services range from shipping and aviation weather to rail, road, and retail weather. It is also known for its “Weathernews” mobile application.
For the most part, the meteorological industry obtains its base information from government agencies. Since it was established, Weathernews has built its own infrastructure for observation, communication networks, image processing, and distribution. In 2005, the company developed the Original Weather Numerator (OWN) as its own weather forecast model. To process the data, it built an on-premises high performance computing (HPC) system, which it continuously enhanced by increasing the number of servers. This enabled the company to forecast at one-hour intervals up to three days ahead. In addition, applying artificial intelligence (AI) technology to radar data and weather reports received from application users enabled forecasting at 10-minute intervals up to three hours ahead, and at one-hour intervals up to 15 hours ahead.
“However, it was pointed out that at one-hour intervals, ‘rainclouds suddenly became blurry and difficult to make out’. In recent years, there has been an increase in rapid weather changes such as thunderstorms and localized downpours. This has led to an increased need for high-resolution forecasting over longer periods of time. Also, being able to make detailed weather forecasts further into the future has long been a dream for us meteorologists,” said Tomohiro Ishibashi, Managing Director and Executive Officer at Weathernews.
The company then started looking for a service capable of forecasting at 10-minute intervals up to 15 hours ahead. It also aimed to update its forecasts more often, moving from a new forecast every six hours (four times a day) to one every three hours (eight times a day).
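The arithmetic behind these targets is easy to check. The short sketch below is only an illustration of the numbers quoted above; the function names are hypothetical and are not part of OWN or any AWS service.

```python
def steps_per_forecast(horizon_hours: int, interval_minutes: int) -> int:
    """Number of forecast time steps needed to cover a horizon at a fixed interval."""
    total_minutes = horizon_hours * 60
    if total_minutes % interval_minutes != 0:
        raise ValueError("horizon must be a whole multiple of the interval")
    return total_minutes // interval_minutes


def runs_per_day(update_every_hours: int) -> int:
    """Number of forecast runs per day for a given update cycle."""
    if 24 % update_every_hours != 0:
        raise ValueError("update cycle must divide 24 hours evenly")
    return 24 // update_every_hours


# Earlier 10-minute product: up to 3 hours ahead.
previous_steps = steps_per_forecast(horizon_hours=3, interval_minutes=10)
# Target product: 10-minute intervals up to 15 hours ahead.
target_steps = steps_per_forecast(horizon_hours=15, interval_minutes=10)

previous_runs = runs_per_day(update_every_hours=6)
target_runs = runs_per_day(update_every_hours=3)

print(f"10-minute steps per forecast: {previous_steps} -> {target_steps}")
print(f"forecast runs per day: {previous_runs} -> {target_runs}")
```

Covering 15 hours at 10-minute intervals takes five times as many 10-minute steps as covering three hours, and the shorter update cycle doubles the number of runs per day, which is why the target called for far more compute.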
Adoption of AWS ParallelCluster to flexibly secure large amounts of compute resources
The main challenge was to procure the large amounts of computational resources needed for forecasting. Forecast Center Development Team leader Kohei Sakamoto reflected, “Adding on-premises resources, as we did previously, would require a huge investment.” As the number of servers increases, so do concerns about failures and the operational burden. In addition, typhoons, thunderstorms, and localized downpours are more frequent in Japan from June to October, so forecasting requires more computational resources in those months than in other seasons. It was difficult for Weathernews to respond flexibly to such load fluctuations using its on-premises environment.
In 2018, Weathernews began exploring a next-generation OWN cloud implementation utilizing AWS ParallelCluster. After thorough validation, the decision to adopt AWS ParallelCluster was made in April 2020. “By performing actual model calculations, we validated performance using the relationship between the number of AWS ParallelCluster instances, the effectiveness of Elastic Fabric Adapter (EFA), and changes in processing speed depending on the instance type,” said Kazunari Takahashi from the Forecast Center Development Team. Initially, there were some concerns about scalability when using MPI (Message Passing Interface) in cloud HPC, but actual measurements showed speed improvements in the range of 5,000 vCPUs. Utilizing EFA, a low-latency network adapter for workloads that require high-bandwidth inter-node communications, such as MPI-based workloads, increased calculation speeds by 25%.
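To put the scaling figures in context, the sketch below shows two quantities commonly used in this kind of validation. It assumes that “calculation speed” means throughput, so a 25% gain shortens the runtime to 1/1.25 of the baseline. All timings and vCPU counts in the example are hypothetical placeholders, not Weathernews measurements.

```python
def runtime_after_speedup(baseline_minutes: float, speed_gain: float) -> float:
    """Runtime after a fractional speed gain (0.25 means 25% faster throughput)."""
    if speed_gain <= -1.0:
        raise ValueError("speed_gain must be greater than -1")
    return baseline_minutes / (1.0 + speed_gain)


def parallel_efficiency(t_base: float, n_base: int, t_scaled: float, n_scaled: int) -> float:
    """Strong-scaling efficiency: measured speedup divided by the ideal speedup."""
    speedup = t_base / t_scaled
    ideal_speedup = n_scaled / n_base
    return speedup / ideal_speedup


# Hypothetical baseline: a 60-minute run without EFA.
with_efa = runtime_after_speedup(baseline_minutes=60.0, speed_gain=0.25)
print(f"runtime with a 25% speed gain: {with_efa:.1f} minutes")

# Hypothetical strong-scaling measurement: 100 minutes on 1,000 vCPUs
# versus 25 minutes on 5,000 vCPUs.
efficiency = parallel_efficiency(t_base=100.0, n_base=1000, t_scaled=25.0, n_scaled=5000)
print(f"parallel efficiency: {efficiency:.0%}")
```

Under the throughput assumption, a 25% speed gain corresponds to a 20% shorter runtime rather than a 25% shorter one.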
To ensure reliability, the company is building a main system and a sub-system in two separate AWS Regions. Processing will normally be carried out on the main system in the Northern Virginia Region, but in the event of failure, it will switch to the sub-system in the Tokyo Region for reprocessing. In addition, the main system environment uses Amazon EC2 Spot Instances, which are spare compute capacity available at up to a 90% discount compared with On-Demand Instance pricing. The company worked with AWS Solutions Architects to configure the system to meet the availability, cost, and performance requirements of the next-generation system.
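The main/sub-system arrangement can be pictured as a simple ordered failover. The following is a minimal sketch of that idea only, not Weathernews’ actual implementation: `run_with_failover`, `ForecastRunError`, and the `flaky_runner` stand-in are hypothetical names, and the only details taken from the text are the two Regions (us-east-1 is Northern Virginia, ap-northeast-1 is Tokyo).

```python
from typing import Callable, List, Sequence, Tuple

PRIMARY_REGION = "us-east-1"         # Northern Virginia (main system)
SECONDARY_REGION = "ap-northeast-1"  # Tokyo (sub-system)


class ForecastRunError(Exception):
    """Raised when a forecast run cannot be completed in a Region (hypothetical)."""


def run_with_failover(
    run_forecast: Callable[[str], str],
    regions: Sequence[str],
) -> Tuple[str, str]:
    """Try each Region in order and return (region, result) for the first success."""
    errors: List[str] = []
    for region in regions:
        try:
            return region, run_forecast(region)
        except ForecastRunError as exc:
            # Record the failure and fall through to the next Region.
            errors.append(f"{region}: {exc}")
    raise ForecastRunError("all regions failed: " + "; ".join(errors))


def flaky_runner(region: str) -> str:
    """Stand-in for a real job submission; fails in the primary Region on purpose."""
    if region == PRIMARY_REGION:
        raise ForecastRunError("simulated capacity interruption")
    return f"forecast from {region}"


used_region, result = run_with_failover(flaky_runner, [PRIMARY_REGION, SECONDARY_REGION])
print(used_region, "->", result)
```

In a real system the runner would submit the job to the cluster in each Region. Spot Instances can be interrupted when capacity is reclaimed, which is one reason a reprocessing path in a second Region is useful.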
Learn more about how Weathernews optimized their weather forecasting simulations on AWS here.
Reminder: You can learn a lot from AWS HPC engineers by subscribing to the HPC Tech Shorts YouTube channel and following the AWS HPC Blog channel.