Fine-Tuning Severe Hail Forecasting with Machine Learning

By Sean Thielen

July 20, 2017

Depending on whether you’ve been caught outside during a severe hail storm, the sight of greenish tinted clouds on the horizon may cause serious knots in the pit of your stomach, or at least give you pause. There’s good reason for that instinctive reaction. Just consider that a massive hail storm that battered the Denver metro area with golf ball-size hail on May 8, 2017, is expected to result in more than 150,000 car insurance claims and an estimated $1.4 billion in damage to property in and around Denver. The fact is that even in 2017, emergency responders, airports and everyone else going about their business must gamble with forecast uncertainties about hail. So how great would it be if you could get accurate warnings highlighting the path of severe hail storms, along with expected hail size, 1–3 hours before a storm passes through?

If the Severe Hail Analysis and Prediction (SHARP) project, which is funded through a grant from the National Science Foundation (NSF), accomplishes its goal of developing an accurate “warn-on-forecast” model for severe hail storms, this could happen in the next five to 10 years. Of course, there is a lot of scientific work to be done in the meantime, along with a need for significantly more computing power.

A two-pronged approach to hail research

The Center for Analysis and Prediction of Storms (CAPS) at the University of Oklahoma (OU) undertook the SHARP project in 2014 after hypothesizing that hail representation in numerical weather prediction (NWP) models, which mathematically model atmospheric physics to predict storms, could be improved by assimilating data from a host of data sources, and that advanced data-mining techniques could improve predictions of hail size and coverage.

Nathan Snook and Amy McGovern, two of the co-principal investigators on the project, say that CAPS pursues its hail research on two fronts. On one front, large amounts of data from various weather observing systems are ingested into weather models to create very high resolution forecasts. The other uses machine learning to sift through weather model output from CAPS and the National Center for Atmospheric Research to discover new knowledge hidden in large data sets and perform post-model correction calibrations to produce more skillful forecasts. For nearly four years, these projects have relied on the Texas Advanced Computer Center’s (TACC) Stampede system, an important part of NSF’s portfolio for advanced computing infrastructure that enables cutting-edge foundational research for computational and data-intensive science and engineering.

The high-resolution modeling is currently post-event and area specific, while the machine learning analysis is done in real time on a nationwide basis. The reason for the difference comes down to workload sizes. “For the high-resolution work, we use data from interesting historical cases to try to accurately predict the size and scope of hail that passes through a specific area,” explains Snook, a CAPS research scientist who focuses on the warn-on-forecast work. “We deal with 1 to 5 TB of data for each case study that we run, and run different experiments on different days, so our computing demands are enormous and the current available resources simply aren’t powerful enough for real-time analysis.”

McGovern, an associate professor of computer science and adjunct associate professor in the School of Meteorology at OU, says that although the machine learning algorithms are computationally intensive to train, it’s no problem to run them in real time because they are at a much coarser resolution than the data sets that Snook’s team uses (3km vs. 500m) and require fewer resources. “Our resource challenges are mainly around having enough storage and bandwidth to transfer all of the data we need on a daily basis…the data sets come from all over the U.S. and they are quite large, so there are a lot of I/O challenges,” explains McGovern.

Both research efforts rely heavily on data from the NOAA Hazardous Weather Testbed (HWT) to support their experiments. “The HWT gathers a wealth of numerical forecast data by collecting forecasts from various research institutions for about five to six weeks every spring. We use that data for a lot of our high-resolution experiments as well for our machine learning. It’s ideal for the machine learning work because it’s a big data set that is relatively stable from year to year,” says Snook.

Chipping away at high-resolution, real time forecasts

CAPS primarily uses two models for its high-resolution research, including the Weather and Research Forecasting (WRF) model, a widely used mesoscale numerical weather prediction system, and in-house model called the Advanced Regional Prediction System (ARPS). Snook says ARPS is also tuned for mesoscale weather analysis and is quite effective at efficiently assimilating radar and surface observations from a lot of different sources. In fact, to achieve greater accuracy in its warn-on-forecast modeling research, the CAPS team uses models with grid points spaced every 500m, as opposed to the 3km spacing typical in many operational high-resolution models. CAPS made the six-fold increase to better support probabilistic 1-3 hour forecasts of the size of hail and the specific counties and cities it will impact. Snook notes that the National Weather Service is moving toward the use of mesoscale forecasts in severe weather operations and that his team’s progress so far has been promising. In several case studies, their high-resolution forecasts have skillfully predicted the path of individual hailstorms up to three hours in advance—one such case is shown in figure 1.

Figure 1: A comparison of radar-indicated hail versus a 0–90 minute CAPS hail forecast for a May 20, 2013 storm in Oklahoma (inset photo shows image of actual hail from the storm).

While the CAPS team is wrapping up the first phase of its research, Snook and his team have identified areas where they need to further improve their model, and are submitting a new proposal to fund additional work. “As you can imagine, we’re nowhere near the computing power needed to track every hailstone and raindrop, so we’re still dealing with a lot of uncertainty in any storm… We have to make bulk estimates about the types of particles that exist in a given model volume, so when you’re talking about simulating something like an individual thunderstorm, it’s easy to introduce small errors which can then quickly grow into large errors throughout the model domain,” explains Snook. “Our new focus is on improving the microphysics within the model—that is, the parameters the model uses to define precipitation, such as cloud water, hail, snow or rain. If we are successful at that, we could see a large improvement in the quality of hail forecasts.”

Going deeper into forecast data with machine learning

Unlike with the current high-resolution research, CAPS runs the machine learning prediction portion of the project using near real-time daily forecast data from the various groups participating in the HWT. CAPS compares daily realtime forecast data against historical HWT data sets using a variety of algorithms and techniques to flush out important hidden data in forecasted storms nationwide. “Although raw forecasts provide some value, they include a lot of additional information that’s not immediately accessible. Machine learning methods are better at predicting the probability and potential size, distribution and severity of hail 24 to 48 hours in advance,” explains McGovern. “We are trying to improve the predictions from what SPC and the current models do.”

Figure 2 is a good illustration for how the machine learning models have improved the prediction of events for a specific case study. The figure, which highlights storms reported in the southern plains on May 27, 2015, compares the predictions using three different methods:

• Machine learning (left)

• A single parameter from the models, currently used to estimate hail (middle)

• A state-of-the-art algorithm currently used to estimate hail size (right)

The green circles show a 25 mile or 40 km radius around hail reports from that day, and the pink colors show the probability of severe hail, as predicted by each model. Although the updraft helicity model (middle) has the locations generally right, the probabilities are quite low. HAILCAST (right) overpredicts hail in the southeast while missing the main event in Oklahoma, Kansas, and Texas. The machine learning model (left) has the highest probabilities of hail exactly where it occurred. In general, this is a good example for how machine learning is now outperforming current prediction methods.

Currently, McGovern’s team is focusing on two aspects of hail forecasts: “First, we are working to get the machine learning methods into production in the Storm Prediction Center to support some high-resolution models they will be releasing. Second, we are improving the predictions by making use of the full 3D data available from the models,” explains McGovern.

Figure 2: A case study that shows the superior accuracy of the machine learning methods (left) compared to other methods.

A welcome resource boost

Snook says that the machine learning and high resolution research have generated close to 100TB of data each that they are sifting through, so access to ample computing resources is essential to ongoing progress. That’s why Snook and McGovern are looking forward to being able to utilize TACC’s Stampede2 system which, in May, began supporting early users and will be fully deployed to the research community later this summer. The new system from Dell includes 4,200 Intel Xeon Phi processors and 1,736 Intel Xeon processors as well as Intel Omni-Path Architecture Fabric, a 10GigE/40GigE management network, and more than 600 TB of memory. It is expected to double the performance of the previous Stampede system with a peak performance of up to 18 petaflops.

McGovern’s team also runs some of the machine learning work locally on the Schooner system at the OU Supercomputing Center for Education and Research (OSCER). Schooner, which includes a combination of Dell PowerEdge R430 and R730 nodes that are based on the Intel Xeon processor E5-2650 and E5-2670 product families as well as more than 450TB of storage, has a peak performance of 346.9 teraflops. “Schooner is a great resource for us because they allow ‘condo nodes’ so we can avoid lines and we also have our own disk that doesn’t get wiped every day,” says McGovern. “It’s also nice being able to collaborate directly with the HPC experts at OSCER.”

Between the two systems, Snook and McGovern expect to continue making steady progress on their research. That doesn’t mean real-time, high-resolution forecasts are right around the corner, however. “I hope that in five to 10 years, the warn-on-forecast approach becomes a reality, but it’s going to take a lot more research and computing power before we get there,” says Snook.

Subscribe to HPCwire's Weekly Update!

Be the most informed person in the room! Stay ahead of the tech trends with industy updates delivered to you every week!

GDPR’s Impact on Scientific Research Uncertain

May 24, 2018

Amid the angst over preparations—or lack thereof—for new European Union data protections entering into force at week’s end is the equally worrisome issue of the rules’ impact on scientific research. Among the Read more…

By George Leopold

Intel Pledges First Commercial Nervana Product ‘Spring Crest’ in 2019

May 24, 2018

At its AI developer conference in San Francisco yesterday, Intel embraced a holistic approach to AI and showed off a broad AI portfolio that includes Xeon processors, Movidius technologies, FPGAs and Intel’s Nervana Neural Network Processors (NNPs), based on the technology it acquired in 2016. Read more…

By Tiffany Trader

Pattern Computer – Startup Claims Breakthrough in ‘Pattern Discovery’ Technology

May 23, 2018

If it weren’t for the heavy-hitter technology team behind start-up Pattern Computer, which emerged from stealth today in a live-streamed event from San Francisco, one would be tempted to dismiss its claims of inventing Read more…

By John Russell

HPE Extreme Performance Solutions

HPC and AI Convergence is Accelerating New Levels of Intelligence

Data analytics is the most valuable tool in the digital marketplace – so much so that organizations are employing high performance computing (HPC) capabilities to rapidly collect, share, and analyze endless streams of data. Read more…

IBM Accelerated Insights

Mastering the Big Data Challenge in Cognitive Healthcare

Patrick Chain, genomics researcher at Los Alamos National Laboratory, posed a question in a recent blog: What if a nurse could swipe a patient’s saliva and run a quick genetic test to determine if the patient’s sore throat was caused by a cold virus or a bacterial infection? Read more…

Silicon Startup Raises ‘Prodigy’ for Hyperscale/AI Workloads

May 23, 2018

There's another silicon startup coming onto the HPC/hyperscale scene with some intriguing and bold claims. Silicon Valley-based Tachyum Inc., which has been emerging from stealth over the last year and a half, is unveili Read more…

By Tiffany Trader

Intel Pledges First Commercial Nervana Product ‘Spring Crest’ in 2019

May 24, 2018

At its AI developer conference in San Francisco yesterday, Intel embraced a holistic approach to AI and showed off a broad AI portfolio that includes Xeon processors, Movidius technologies, FPGAs and Intel’s Nervana Neural Network Processors (NNPs), based on the technology it acquired in 2016. Read more…

By Tiffany Trader

Pattern Computer – Startup Claims Breakthrough in ‘Pattern Discovery’ Technology

May 23, 2018

If it weren’t for the heavy-hitter technology team behind start-up Pattern Computer, which emerged from stealth today in a live-streamed event from San Franci Read more…

By John Russell

Silicon Startup Raises ‘Prodigy’ for Hyperscale/AI Workloads

May 23, 2018

There's another silicon startup coming onto the HPC/hyperscale scene with some intriguing and bold claims. Silicon Valley-based Tachyum Inc., which has been eme Read more…

By Tiffany Trader

Japan Meteorological Agency Takes Delivery of Pair of Crays

May 21, 2018

Cray has supplied two identical Cray XC50 supercomputers to the Japan Meteorological Agency (JMA) in northwestern Tokyo. Boasting more than 18 petaflops combine Read more…

By Tiffany Trader

ASC18: Final Results Revealed & Wrapped Up

May 17, 2018

It was an exciting week at ASC18 in Nanyang, China. The student teams braved extreme heat, extremely difficult applications, and extreme competition in order to cross the cluster competition finish line. The gala awards ceremony took place on Wednesday. The auditorium was packed with student teams, various dignitaries, the media, and other interested parties. So what happened? Read more…

By Dan Olds

Spring Meetings Underscore Quantum Computing’s Rise

May 17, 2018

The month of April 2018 saw four very important and interesting meetings to discuss the state of quantum computing technologies, their potential impacts, and th Read more…

By Alex R. Larzelere

Quantum Network Hub Opens in Japan

May 17, 2018

Following on the launch of its Q Commercial quantum network last December with 12 industrial and academic partners, the official Japanese hub at Keio University is now open to facilitate the exploration of quantum applications important to science and business. The news comes a week after IBM announced that North Carolina State University was the first U.S. university to join its Q Network. Read more…

By Tiffany Trader

Democratizing HPC: OSC Releases Version 1.3 of OnDemand

May 16, 2018

Making HPC resources readily available and easier to use for scientists who may have less HPC expertise is an ongoing challenge. Open OnDemand is a project by t Read more…

By John Russell

MLPerf – Will New Machine Learning Benchmark Help Propel AI Forward?

May 2, 2018

Let the AI benchmarking wars begin. Today, a diverse group from academia and industry – Google, Baidu, Intel, AMD, Harvard, and Stanford among them – releas Read more…

By John Russell

How the Cloud Is Falling Short for HPC

March 15, 2018

The last couple of years have seen cloud computing gradually build some legitimacy within the HPC world, but still the HPC industry lies far behind enterprise I Read more…

By Chris Downing

Russian Nuclear Engineers Caught Cryptomining on Lab Supercomputer

February 12, 2018

Nuclear scientists working at the All-Russian Research Institute of Experimental Physics (RFNC-VNIIEF) have been arrested for using lab supercomputing resources to mine crypto-currency, according to a report in Russia’s Interfax News Agency. Read more…

By Tiffany Trader

Nvidia Responds to Google TPU Benchmarking

April 10, 2017

Nvidia highlights strengths of its newest GPU silicon in response to Google's report on the performance and energy advantages of its custom tensor processor. Read more…

By Tiffany Trader

Deep Learning at 15 PFlops Enables Training for Extreme Weather Identification at Scale

March 19, 2018

Petaflop per second deep learning training performance on the NERSC (National Energy Research Scientific Computing Center) Cori supercomputer has given climate Read more…

By Rob Farber

AI Cloud Competition Heats Up: Google’s TPUs, Amazon Building AI Chip

February 12, 2018

Competition in the white hot AI (and public cloud) market pits Google against Amazon this week, with Google offering AI hardware on its cloud platform intended Read more…

By Doug Black

US Plans $1.8 Billion Spend on DOE Exascale Supercomputing

April 11, 2018

On Monday, the United States Department of Energy announced its intention to procure up to three exascale supercomputers at a cost of up to $1.8 billion with th Read more…

By Tiffany Trader

Lenovo Unveils Warm Water Cooled ThinkSystem SD650 in Rampup to LRZ Install

February 22, 2018

This week Lenovo took the wraps off the ThinkSystem SD650 high-density server with third-generation direct water cooling technology developed in tandem with par Read more…

By Tiffany Trader

Leading Solution Providers

SC17 Booth Video Tours Playlist

Altair @ SC17


AMD @ SC17


ASRock Rack @ SC17

ASRock Rack



DDN Storage @ SC17

DDN Storage

Huawei @ SC17


IBM @ SC17


IBM Power Systems @ SC17

IBM Power Systems

Intel @ SC17


Lenovo @ SC17


Mellanox Technologies @ SC17

Mellanox Technologies

Microsoft @ SC17


Penguin Computing @ SC17

Penguin Computing

Pure Storage @ SC17

Pure Storage

Supericro @ SC17


Tyan @ SC17


Univa @ SC17


HPC and AI – Two Communities Same Future

January 25, 2018

According to Al Gara (Intel Fellow, Data Center Group), high performance computing and artificial intelligence will increasingly intertwine as we transition to Read more…

By Rob Farber

Google Chases Quantum Supremacy with 72-Qubit Processor

March 7, 2018

Google pulled ahead of the pack this week in the race toward "quantum supremacy," with the introduction of a new 72-qubit quantum processor called Bristlecone. Read more…

By Tiffany Trader

CFO Steps down in Executive Shuffle at Supermicro

January 31, 2018

Supermicro yesterday announced senior management shuffling including prominent departures, the completion of an audit linked to its delayed Nasdaq filings, and Read more…

By John Russell

HPE Wins $57 Million DoD Supercomputing Contract

February 20, 2018

Hewlett Packard Enterprise (HPE) today revealed details of its massive $57 million HPC contract with the U.S. Department of Defense (DoD). The deal calls for HP Read more…

By Tiffany Trader

Deep Learning Portends ‘Sea Change’ for Oil and Gas Sector

February 1, 2018

The billowing compute and data demands that spurred the oil and gas industry to be the largest commercial users of high-performance computing are now propelling Read more…

By Tiffany Trader

Nvidia Ups Hardware Game with 16-GPU DGX-2 Server and 18-Port NVSwitch

March 27, 2018

Nvidia unveiled a raft of new products from its annual technology conference in San Jose today, and despite not offering up a new chip architecture, there were still a few surprises in store for HPC hardware aficionados. Read more…

By Tiffany Trader

Hennessy & Patterson: A New Golden Age for Computer Architecture

April 17, 2018

On Monday June 4, 2018, 2017 A.M. Turing Award Winners John L. Hennessy and David A. Patterson will deliver the Turing Lecture at the 45th International Sympo Read more…

By Staff

Part One: Deep Dive into 2018 Trends in Life Sciences HPC

March 1, 2018

Life sciences is an interesting lens through which to see HPC. It is perhaps not an obvious choice, given life sciences’ relative newness as a heavy user of H Read more…

By John Russell

  • arrow
  • Click Here for More Headlines
  • arrow
Share This