🤖 AI Summary
This work addresses two limitations in data assimilation: traditional filtering methods built on frame-to-frame transition models accumulate errors under non-Markovian observations, and existing learning-based approaches do not unify filtering and smoothing. The authors propose ForcingDAS, a framework that uses Diffusion Forcing to learn a joint-trajectory prior in place of frame-to-frame transition models. A single trained model supports the full spectrum of data assimilation, from real-time filtering to retrospective smoothing, without retraining. By assigning an independent noise level to each frame, it captures long-horizon temporal dependencies. Evaluated on two-dimensional Navier–Stokes vorticity estimation, precipitation nowcasting, and global atmospheric state estimation, ForcingDAS is competitive with or outperforms specialized classical and learning-based baselines, with the largest gains on real-world weather benchmarks.
📝 Abstract
Data assimilation (DA) estimates the state of an evolving dynamical system from noisy, partial observations, and is widely used in scientific simulation as well as weather and climate science. In practice, filtering methods rely on frame-to-frame transition models. However, these models are fragile when observations are non-Markovian (that is, when they form only a partial slice of a higher-dimensional latent state, as in real-world weather data): they tend to accumulate errors over long horizons. At the same time, learned DA methods typically commit to a single regime, either filtering (nowcasting, real-time forecasting) or smoothing (retrospective reanalysis), which splits what should be a shared prior across application-specific pipelines. To address both issues, we introduce ForcingDAS, a unified and robust DA framework. Built on Diffusion Forcing with an independent noise level assigned to each frame, ForcingDAS learns a joint-trajectory prior instead of frame-to-frame transitions. This allows it to capture long-horizon temporal dependencies and reduce error accumulation. In addition, the same trained model spans the full filtering-to-smoothing spectrum at inference time. Specifically, nowcasting, fixed-lag smoothing, and batch reanalysis are selected through the inference schedule alone, without retraining. We evaluate ForcingDAS on 2D Navier–Stokes vorticity, precipitation nowcasting, and global atmospheric state estimation. Across all settings, a single model is competitive with or outperforms both learned and classical baselines that are specialized for individual regimes, with the largest gains observed on real-world weather benchmarks.
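To make the idea of per-frame noise levels and schedule-selected regimes concrete, the following is a minimal sketch and not the authors' implementation. It assumes discrete noise-level indices from 0 (clean) to `num_levels` (pure noise). The function names and the `delay` parameter are hypothetical, and the mapping of `delay` onto regimes (0 resembling batch reanalysis, larger values resembling sequential, filtering-style denoising) is an illustrative assumption rather than the paper's actual schedules.

```python
import numpy as np


def sample_training_noise_levels(num_frames, num_levels, rng):
    """Draw one independent noise-level index per frame.

    Illustrative assumption: levels are integers in [0, num_levels],
    where 0 means a clean frame and num_levels means pure noise.
    """
    return rng.integers(0, num_levels + 1, size=num_frames)


def inference_schedule(num_frames, num_levels, delay):
    """Build a (num_steps, num_frames) table of noise-level indices.

    Row s gives the noise level of every frame at sampling step s.
    Frame t starts denoising `delay * t` steps after frame 0.
    delay = 0 denoises all frames jointly; delay = num_levels finishes
    each frame before the next one starts. Hypothetical parameterization.
    """
    num_steps = num_levels + delay * (num_frames - 1) + 1
    steps = np.arange(num_steps)[:, None]    # shape (num_steps, 1)
    frames = np.arange(num_frames)[None, :]  # shape (1, num_frames)
    # Each frame's level decreases by one per step once its delay has elapsed.
    levels = num_levels - (steps - delay * frames)
    return np.clip(levels, 0, num_levels)


rng = np.random.default_rng(0)
train_levels = sample_training_noise_levels(num_frames=8, num_levels=10, rng=rng)

joint = inference_schedule(num_frames=3, num_levels=2, delay=0)
staggered = inference_schedule(num_frames=3, num_levels=2, delay=1)
sequential = inference_schedule(num_frames=3, num_levels=2, delay=2)
```

In this sketch every schedule starts with all frames at the maximum noise level and ends with all frames clean; only the order in which frames are denoised changes, which is why a single trained denoiser could in principle serve all of them.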