🤖 AI Summary
To address error accumulation in autoregressive models for subseasonal-to-seasonal (S2S) weather forecasting, this paper proposes OmniCast, a non-autoregressive joint spatiotemporal diffusion framework. The method employs a VAE to learn a low-dimensional continuous latent space and a diffusion-based Transformer that generates the full sequence of future latent tokens simultaneously via masked modeling—trained with a per-token diffusion head and sampled at inference by iteratively unmasking random subsets of tokens. This joint sampling across space and time avoids the compounding errors of autoregressive rollouts. OmniCast matches leading probabilistic methods at the medium-range timescale while being 10–20× faster at inference, achieves state-of-the-art S2S performance across accuracy, physics-based, and probabilistic metrics, and remains stable in rollouts up to 100 years ahead—demonstrating strong temporal scalability and robustness in long-horizon atmospheric modeling.
📝 Abstract
Accurate weather forecasting across time scales is critical for anticipating and mitigating the impacts of climate change. Recent data-driven methods based on deep learning have achieved significant success in the medium range, but struggle at longer subseasonal-to-seasonal (S2S) horizons due to error accumulation in their autoregressive approach. In this work, we propose OmniCast, a scalable and skillful probabilistic model that unifies weather forecasting across timescales. OmniCast consists of two components: a VAE model that encodes raw weather data into a continuous, lower-dimensional latent space, and a diffusion-based transformer model that generates a sequence of future latent tokens given the initial conditioning tokens. During training, we mask random future tokens and train the transformer to estimate their distribution given conditioning and visible tokens using a per-token diffusion head. During inference, the transformer generates the full sequence of future tokens by iteratively unmasking random subsets of tokens. This joint sampling across space and time mitigates compounding errors from autoregressive approaches. The low-dimensional latent space enables modeling long sequences of future latent states, allowing the transformer to learn weather dynamics beyond initial conditions. OmniCast performs competitively with leading probabilistic methods at the medium-range timescale while being 10x to 20x faster, and achieves state-of-the-art performance at the subseasonal-to-seasonal scale across accuracy, physics-based, and probabilistic metrics. Furthermore, we demonstrate that OmniCast can generate stable rollouts up to 100 years ahead. Code and model checkpoints are available at https://github.com/tung-nd/omnicast.
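The iterative-unmasking inference described above can be illustrated with a minimal sketch. This is not the paper's implementation: `sample_token` is a hypothetical stand-in for the per-token diffusion head (here it just draws Gaussian noise instead of running a conditioned reverse-diffusion sampler), and all names and shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_token(visible_tokens, token_dim):
    """Hypothetical stand-in for the per-token diffusion head.

    In OmniCast, a reverse-diffusion sampler would generate this token
    conditioned on the transformer's representation of the conditioning
    and currently visible tokens; here we just draw Gaussian noise.
    """
    return rng.normal(size=token_dim)

def iterative_unmask(num_future, token_dim, steps=4):
    """Generate all future latent tokens by unmasking random subsets."""
    tokens = np.full((num_future, token_dim), np.nan)  # NaN marks "masked"
    order = rng.permutation(num_future)                # random unmasking order
    per_step = int(np.ceil(num_future / steps))
    for s in range(steps):
        # Sample one subset of masked positions, conditioning on what is
        # already visible, so space and time are filled in jointly.
        for idx in order[s * per_step:(s + 1) * per_step]:
            tokens[idx] = sample_token(tokens, token_dim)
    return tokens

out = iterative_unmask(num_future=8, token_dim=4)
print(out.shape)  # (8, 4)
```

Because every step conditions on all tokens generated so far, across the whole sequence rather than only the preceding timestep, errors are not compounded step-by-step as in autoregressive rollouts.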