Today’s Outlook: GPU-accelerated Weather Forecasting

By John Russell

September 15, 2015

Weather forecasting has always been challenging. When Hurricane Sandy raced up the Atlantic coast, U.S.-run models famously suggested it would track out to sea while a European model predicted a sharp left turn and landfall, which sadly happened. Forecasting is a high-stakes activity. Today, the Swiss Federal Office of Meteorology and Climatology (MeteoSwiss) announced use of the first GPU-accelerated supercomputer for weather forecasting.

The new system, a Cray (NASDAQ: CRAY) supercomputer (Cray CS-Storm, details below) accelerated by NVIDIA (NASDAQ: NVDA) Tesla K80 GPUs, is delivering 40x the power of its predecessor CPU-based system, according to MeteoSwiss. According to NVIDIA, the effort is the culmination of a roughly two-year collaboration between NVIDIA, MeteoSwiss, the Swiss National Supercomputing Center (CSCS), Cray, and the Consortium for Small Scale Modeling (COSMO) to improve MeteoSwiss’s weather forecasting capability.

NVIDIA, of course, hopes GPU-based acceleration will become an essential tool in the weather forecasting community. Each node on the new MeteoSwiss system has eight GPUs; across the whole system (48 CPUs and 192 K80s), the GPUs deliver 90% of the flops. Currently the new and older systems are both running, with plans to phase out the older one sometime in 2016, when the new system is fully operational.

A weather model samples the state of the atmosphere at a given time and uses fluid motion and thermodynamics equations to predict the state of the atmosphere at some time in the future. The model divides a forecast region into a grid, and the equations are solved within each grid cell, with interactions between neighboring cells, to compute a prediction. The closer grid points are to one another, the higher the overall model resolution, which leads to increased realism in the final forecast.
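The grid-and-neighbors idea can be sketched in a few lines of Python. This is a toy example only, not the COSMO model: it assumes a single scalar field, a simple diffusion rule in place of the real fluid and thermodynamics equations, and a grid that wraps around at the edges. The names `step` and `forecast` and the value of `alpha` are hypothetical.

```python
# Toy illustration only: explicit diffusion on a small periodic 2D grid.
# This is NOT the COSMO model; names and parameter values are hypothetical.

def step(field, alpha=0.1):
    """Advance the grid one time step using each cell's four neighbors."""
    ny, nx = len(field), len(field[0])
    new = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            north = field[(j - 1) % ny][i]
            south = field[(j + 1) % ny][i]
            west = field[j][(i - 1) % nx]
            east = field[j][(i + 1) % nx]
            center = field[j][i]
            # Each cell relaxes toward the average of its neighbors.
            new[j][i] = center + alpha * (north + south + west + east - 4.0 * center)
    return new


def forecast(initial, n_steps, alpha=0.1):
    """Repeatedly apply step() to predict the state n_steps into the future."""
    state = initial
    for _ in range(n_steps):
        state = step(state, alpha)
    return state


# Sampled initial state: one warm cell in an otherwise uniform 5x5 grid.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 1.0
future = forecast(grid, n_steps=10)
```

Every cell update reads only its immediate neighbors, so all cells can be updated independently within a time step. That independence is the kind of parallelism that suits GPUs.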

Optimizing the COSMO code was a significant effort and important to achieving the performance gains. Weather forecasting apps are typically 10 to 20 years old or more and tend to be written in Fortran, according to Roy Kim, group product manager of accelerated computing at NVIDIA. A combination of CUDA and OpenACC was used to optimize and port the code for GPUs.

“They [team working on the project] used OpenACC to make the code portable and maintainable. For the CUDA portion, they created a library called Stella which allowed the code to be readable,” said Kim.

The work seems to have paid off. MeteoSwiss runs both 24-hour, hourly forecasts and medium-range forecasts of a few days. Before switching to the GPU-accelerated system, 24-hour forecast models were based on 2.2km grids and eight simulations were run per day. After the switch, grid granularity improved to 1.1km. Medium-range forecasts had used a 6.6km grid, were run three times a day, and yielded a 3-day forecast; afterward the grid spacing shrank to 2.2km, with 42 simulations run per day producing a 5-day forecast.
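To get a feel for what those grid changes mean computationally, here is a rough back-of-the-envelope calculation in Python. It assumes the forecast area stays the same and counts only horizontal grid cells, ignoring vertical levels and any change in time step, so it understates the real increase in work. The function name is hypothetical.

```python
# Rough arithmetic only: counts horizontal grid cells over a fixed area.
# It ignores vertical levels, time-step changes, and domain differences.

def horizontal_cell_factor(old_spacing_km, new_spacing_km):
    """How many times more horizontal cells cover the same area."""
    return (old_spacing_km / new_spacing_km) ** 2


short_range = horizontal_cell_factor(2.2, 1.1)   # 24-hour forecasts
medium_range = horizontal_cell_factor(6.6, 2.2)  # medium-range forecasts
print(f"{short_range:.0f}x cells at 1.1 km, {medium_range:.0f}x cells at 2.2 km")
```

Halving the grid spacing roughly quadruples the number of horizontal cells, and cutting it to a third gives roughly nine times as many, before accounting for the larger number of simulations per day.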

“Previously, they had difficulty modeling formations of storm clouds at 2.2km resolution, but at 1.1km they’re able to model storm clouds quite precisely. They are also now able to run ensembles of simulations instead of just one,” said Kim. The improved model fidelity is also expected to allow tracking of the quickly changing microclimates associated with elevation changes in the Swiss Alps.


Significantly, the improved and ported code is now part of COSMO’s general distribution, which NVIDIA hopes will stimulate more activity among the large community of COSMO users, mostly based in Europe.

The two cabinets of the Cray CS-Storm supercomputer at CSCS are tightly packed. Each cabinet consists of 12 hybrid computing nodes for a total of 96 NVIDIA Tesla K80 GPU accelerators and 24 Intel Haswell CPUs. The GPUs are one of the key elements of the new computer system. They allow simulations that are three times more energy-efficient and twice as fast as on conventional CPUs. “High-quality weather forecasts always depend upon processing power,” said CSCS Director Thomas Schulthess. “With the GPUs and the revised model, we can compute weather simulations more quickly and accurately than with conventional systems – and more energy and cost-efficient.”

“The ground-breaking use of the high-density GPU-based Cray CS-Storm system to run operational weather forecasts for the very first time is the direct result of the strong, collaborative partnership between CSCS, MeteoSwiss, NVIDIA and Cray,” said Barry Bolding, Cray’s Senior Vice President and Chief Strategy Officer. “With an eight-to-two ratio of GPUs to CPUs, the Cray CS-Storm system will provide MeteoSwiss with a powerful tool for running more detailed and higher-resolution weather forecasts.”
