Why AI Still Struggles to Predict Extreme Weather Events
Artificial intelligence is increasingly positioned as the successor to traditional weather forecasting due to its speed and precision. However, research published in the journal Science indicates that AI models possess a significant blind spot when predicting extreme weather events, where traditional physics-based models continue to demonstrate superior performance.
The study, co-authored by Sebastian Engelke, a statistics professor at the University of Geneva, evaluated leading AI weather models including Google DeepMind’s GraphCast and Pangu-Weather. The researchers tested these systems against a database of recent extreme weather events to determine if the efficiency of AI translates to reliability during high-impact atmospheric anomalies.
“They do perform well on a lot of tasks, but for very extreme events, that are the most important for society, they still struggle.”
Sebastian Engelke, statistics professor at the University of Geneva
The Training Data Gap
The disparity in performance stems from the fundamental difference in how AI and traditional models process information. AI weather models are trained on decades of historical meteorological data, using empirical patterns to predict future states. In effect, they identify a current weather pattern and reproduce the most likely outcome based on what has happened previously.
Because record-breaking events are, by definition, unprecedented, they are often missing from the training sets. Engelke noted that “it’s really the lack of information in their training data that makes it almost impossible for them to forecast it.”
In contrast, traditional forecasting relies on numerical weather prediction (NWP). These systems use complex mathematical equations to represent the physical laws of the atmosphere and oceans. Because these models simulate the physical world rather than mimicking historical data, they can adapt more effectively to new or extreme conditions that have no historical precedent.
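The contrast can be illustrated with a deliberately tiny Python sketch. Everything in it is an assumption made for illustration: `toy_physics` is a made-up linear rule rather than a real NWP equation, and `analog_forecast` is a nearest-neighbor average standing in for a data-driven model (real AI weather models such as GraphCast are neural networks, not lookups). The point is only that a forecast built from past outcomes cannot exceed the hottest outcome on record, while a rule that encodes the mechanism can.

```python
import random

def toy_physics(temp_c, heating):
    # Hypothetical stand-in for a physical law: next-day temperature
    # rises in proportion to the heating applied. Not a real NWP equation.
    return temp_c + 0.5 * heating

def make_history(n=500, seed=0):
    # "Historical record": ordinary days only, 10-30 C with modest heating.
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        temp_c = rng.uniform(10.0, 30.0)
        heating = rng.uniform(-4.0, 4.0)
        records.append((temp_c, heating, toy_physics(temp_c, heating)))
    return records

def analog_forecast(records, temp_c, heating, k=5):
    # Empirical forecast: average the outcomes of the k most similar past days.
    def distance(rec):
        return (rec[0] - temp_c) ** 2 + (rec[1] - heating) ** 2
    nearest = sorted(records, key=distance)[:k]
    return sum(rec[2] for rec in nearest) / k

history = make_history()
hottest_seen = max(rec[2] for rec in history)

# A typical day, well inside the historical range.
typical_truth = toy_physics(20.0, 0.0)
typical_guess = analog_forecast(history, 20.0, 0.0)

# An unprecedented day: hotter and more strongly heated than any past day.
extreme_truth = toy_physics(38.0, 8.0)
extreme_guess = analog_forecast(history, 38.0, 8.0)

print(f"typical day: truth={typical_truth:.1f}, empirical={typical_guess:.1f}")
print(f"extreme day: truth={extreme_truth:.1f}, empirical={extreme_guess:.1f}")
print(f"hottest outcome in the record: {hottest_seen:.1f}")
```

On the typical day the empirical forecast lands close to the truth. On the unprecedented day it is capped by the hottest outcome in its record, so it underestimates the peak.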
Underestimating Extreme Events
The research highlighted specific failures in AI’s ability to predict record-breaking heat, cold, and wind. One primary example cited was a heat wave in Siberia in early 2020, which resulted in melting permafrost and widespread wildfires. AI predictions tended to underestimate the peak temperatures of this event.

The severity of the Siberian heat wave was linked to broader climatic shifts; a separate study found that global warming made the event 600 times more likely to occur. Because such an extreme temperature spike is an outlier in historical records, AI models struggled to capture its magnitude.
Although AI forecasting has expanded over the past year to include probabilistic models, which predict multiple potential outcomes to increase accuracy, the reliance on historical training data remains a core limitation.
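A rough Python sketch of why producing multiple outcomes does not by itself remove that limitation follows. It is a toy under stated assumptions, not how any production system works: `toy_physics` is a made-up rule used only to generate data, `analog_forecast` is a nearest-neighbor average standing in for a learned model, and `ensemble_forecast` simply reruns it from randomly perturbed starting conditions.

```python
import random

rng = random.Random(42)

def toy_physics(temp_c, heating):
    # Hypothetical rule used only to generate the toy data.
    return temp_c + 0.5 * heating

# Toy historical record: ordinary days only.
history = []
for _ in range(500):
    t = rng.uniform(10.0, 30.0)
    h = rng.uniform(-4.0, 4.0)
    history.append((t, h, toy_physics(t, h)))

def analog_forecast(temp_c, heating, k=5):
    # Empirical forecast: average outcome of the k most similar past days.
    def distance(rec):
        return (rec[0] - temp_c) ** 2 + (rec[1] - heating) ** 2
    nearest = sorted(history, key=distance)[:k]
    return sum(rec[2] for rec in nearest) / k

def ensemble_forecast(temp_c, heating, members=20, noise=1.0):
    # Probabilistic variant: rerun the forecast from perturbed inputs.
    return [
        analog_forecast(temp_c + rng.gauss(0.0, noise),
                        heating + rng.gauss(0.0, noise))
        for _ in range(members)
    ]

members = ensemble_forecast(38.0, 8.0)  # an unprecedented day
truth = toy_physics(38.0, 8.0)
hottest_seen = max(rec[2] for rec in history)

print(f"ensemble range: {min(members):.1f} to {max(members):.1f}")
print(f"hottest outcome in the record: {hottest_seen:.1f}, truth: {truth:.1f}")
```

The ensemble gives a range rather than a single number, but every member is still an average of past outcomes, so the whole range sits at or below the hottest day in the record.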
AI Successes and Integration
Despite these failures in extreme scenarios, AI remains highly effective for typical weather patterns and certain types of intense events that fall within known historical ranges. AI models are currently used to accurately predict hurricane paths and are integrated into the workflows of insurance companies, weather data firms like the Weather Company, and various government weather agencies.
Nvidia’s AI forecasting model, Atlas, demonstrated the ability to handle some intense events. In a study conducted earlier in 2026, Atlas was tested on Storm Dennis, a rapidly intensifying cyclone that hit the United Kingdom. The model successfully captured the intensity of the wind and the pressure gradient associated with the cyclone.
“You can see just clearly by visualizing the magnitude of the wind and the magnitude of the pressure gradient that the model was able to capture realistically intense wind events and really intense cyclones that cause damage.”
Mike Pritchard, director of climate simulation research at Nvidia
Improving Forecast Extrapolation
To bridge the gap between empirical AI and physical reality, researchers are investigating ways to enhance training sets. One proposed method involves using physics-based models to simulate hypothetical record-breaking events and adding that synthetic data to the AI’s training set.
Pritchard explained that “you’ll see ways to coerce physics weather models to produce especially extreme events,” which can then be sprinkled into training data to help AI models learn how to extrapolate beyond historical observations.
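As a minimal sketch of that idea in Python, the toy below adds physics-simulated extremes to an empirical forecaster's record and compares its error before and after. All names and numbers are hypothetical: `toy_physics` is a made-up rule rather than a real weather model, `analog_forecast` is a nearest-neighbor average standing in for a trained AI model, and the ranges chosen for the synthetic events are arbitrary.

```python
import random

def toy_physics(temp_c, heating):
    # Hypothetical stand-in for a physics-based model.
    return temp_c + 0.5 * heating

def sample_records(rng, n, temp_range, heating_range):
    # Draw n days from the given ranges and label them with the toy physics.
    records = []
    for _ in range(n):
        temp_c = rng.uniform(*temp_range)
        heating = rng.uniform(*heating_range)
        records.append((temp_c, heating, toy_physics(temp_c, heating)))
    return records

def analog_forecast(records, temp_c, heating, k=5):
    # Empirical forecast: average outcome of the k most similar records.
    def distance(rec):
        return (rec[0] - temp_c) ** 2 + (rec[1] - heating) ** 2
    nearest = sorted(records, key=distance)[:k]
    return sum(rec[2] for rec in nearest) / k

rng = random.Random(1)

# Observed history: ordinary days only.
observed = sample_records(rng, 500, (10.0, 30.0), (-4.0, 4.0))

# Synthetic extremes: the physics model is pushed into conditions
# that never appear in the observed record.
synthetic = sample_records(rng, 100, (30.0, 40.0), (4.0, 10.0))
augmented = observed + synthetic

truth = toy_physics(38.0, 8.0)  # an unprecedented day
err_before = abs(analog_forecast(observed, 38.0, 8.0) - truth)
err_after = abs(analog_forecast(augmented, 38.0, 8.0) - truth)

print(f"error with observed data only: {err_before:.1f}")
print(f"error with synthetic extremes added: {err_after:.1f}")
```

In this toy the error on the unprecedented day shrinks once simulated extremes are in the record. The result depends on the synthetic events being physically realistic, which the sketch simply assumes.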
As these models are primarily developed by private technology companies, Engelke emphasized the necessity of independent evaluation and benchmarking. He argued that such testing is critical because these forecasts have a direct impact on public safety and societal infrastructure.
While AI continues to improve its speed and efficiency, the current research suggests that traditional physics-based forecasting will remain an essential component of meteorological science for the foreseeable future.
