DAISI: Data Assimilation with Inverse Sampling using Stochastic Interpolants

📅 2025-11-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional data assimilation (DA) methods, such as the ensemble Kalman filter, rely on Gaussian assumptions and heuristic parameter tuning, which can lead to instability or substantial bias in non-Gaussian, strongly nonlinear systems. To address these limitations, the paper proposes DAISI, a scalable DA framework built on flow-based generative models. DAISI maps the forecast ensemble into a latent space via an inverse-sampling step, which encodes model dynamics without retraining or fine-tuning the pre-trained generative prior, and fuses forecasts with observations through guidance-based conditional sampling in that latent space. The data-driven prior itself is constructed with stochastic interpolants, enabling efficient probabilistic inference. Experiments show that DAISI significantly outperforms conventional DA methods in high-dimensional, non-Gaussian, sparsely observed, and strongly nonlinear settings, achieving both superior filtering accuracy and numerical stability.

📝 Abstract
Data assimilation (DA) is a cornerstone of scientific and engineering applications, combining model forecasts with sparse and noisy observations to estimate latent system states. Classical DA methods, such as the ensemble Kalman filter, rely on Gaussian approximations and heuristic tuning (e.g., inflation and localization) to scale to high dimensions. While often successful, these approximations can make the methods unstable or inaccurate when the underlying distributions of states and observations depart significantly from Gaussianity. To address this limitation, we introduce DAISI, a scalable filtering algorithm built on flow-based generative models that enables flexible probabilistic inference using data-driven priors. The core idea is to use a stationary, pre-trained generative prior to assimilate observations via guidance-based conditional sampling while incorporating forecast information through a novel inverse-sampling step. This step maps the forecast ensemble into a latent space to provide initial conditions for the conditional sampling, allowing us to encode model dynamics into the DA pipeline without having to retrain or fine-tune the generative prior at each assimilation step. Experiments on challenging nonlinear systems show that DAISI achieves accurate filtering results in regimes with sparse, noisy, and nonlinear observations where traditional methods struggle.
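The pipeline in the abstract can be sketched in a toy form: invert the forecast ensemble through the generative map to get latent initial conditions, then take guidance steps on the observation log-likelihood before decoding back. Everything below is an illustrative assumption, not the authors' implementation: the "flow" is a fixed invertible affine map standing in for a learned stochastic-interpolant prior, and guidance is plain gradient ascent on a Gaussian likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy affine "flow": an invertible map standing in for a pre-trained
# flow-based generative prior (assumption for illustration only).
A = np.array([[1.0, 0.5],
              [0.0, 2.0]])
A_inv = np.linalg.inv(A)

def flow_forward(z):
    # Generative direction: latent z -> state x.
    return z @ A.T

def flow_inverse(x):
    # Inverse sampling: map forecast ensemble members to latent space.
    return x @ A_inv.T

def assimilate(x_forecast, y, H, obs_var, step=0.05, n_steps=200):
    """One DAISI-style step (sketch): invert the forecast ensemble,
    then take guidance steps on log p(y | x) in latent space."""
    z = flow_inverse(x_forecast)
    for _ in range(n_steps):
        x = flow_forward(z)
        grad_x = (y - x @ H.T) @ H / obs_var  # d log p(y|x) / dx
        z = z + step * grad_x @ A             # chain rule through the flow
    return flow_forward(z)

# Forecast ensemble scattered away from the observed value.
x_f = rng.normal(loc=[3.0, -1.0], scale=0.5, size=(64, 2))
H = np.array([[1.0, 0.0]])  # observe only the first state component
y = np.array([0.0])
x_a = assimilate(x_f, y, H, obs_var=0.1)

err_before = np.abs(x_f @ H.T - y).mean()
err_after = np.abs(x_a @ H.T - y).mean()
```

Because the guidance here uses only the likelihood term, the analysis ensemble collapses onto the observation in the observed component; the real method would balance this against the generative prior.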
Problem

Research questions and friction points this paper is trying to address.

Develops a scalable filtering algorithm for non-Gaussian data assimilation
Uses flow-based generative models to incorporate data-driven priors
Enables accurate state estimation with sparse, noisy, and nonlinear observations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Flow-based generative models for flexible probabilistic inference
Inverse-sampling step mapping forecast ensemble into latent space
Guidance-based conditional sampling with stationary pre-trained generative prior
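The stochastic-interpolant construction behind the data-driven prior can be sketched with one common interpolant family (the specific functional form below is an assumption, not taken from the paper): a path that connects a base sample at t=0 to a data sample at t=1, with noise that vanishes at both endpoints.

```python
import numpy as np

def stochastic_interpolant(x0, x1, t, gamma=1.0, rng=None):
    """Bridge a base sample x0 (t=0) and a data sample x1 (t=1):
        x_t = (1 - t) x0 + t x1 + gamma * sqrt(t (1 - t)) z,  z ~ N(0, I)
    The sqrt(t(1-t)) noise scale is zero at both endpoints, so the
    interpolant pins the base and data distributions exactly."""
    rng = rng or np.random.default_rng()
    z = rng.standard_normal(np.shape(x0))
    return (1 - t) * x0 + t * x1 + gamma * np.sqrt(t * (1 - t)) * z

x0 = np.zeros(3)   # e.g. a Gaussian base sample
x1 = np.ones(3)    # e.g. a latent-space data sample
mid = stochastic_interpolant(x0, x1, 0.5)
```

A generative model is then trained to follow such paths; sampling runs the learned dynamics from the base distribution to the data distribution.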
Martin Andrae
PhD Student, Linköping University
Machine Learning Weather Prediction · Probabilistic Spatiotemporal Modeling
Erik Larsson
Division of Statistics and Machine Learning, Linköping University, Linköping, Sweden
So Takao
Postdoc at California Institute of Technology
Statistical Machine Learning · Stochastic Differential Equations · Geometric Mechanics · Fluid
Tomas Landelius
Division of Statistics and Machine Learning, Linköping University, Linköping, Sweden; Swedish Meteorological and Hydrological Institute, Norrköping, Sweden
Fredrik Lindsten
Associate Professor, Linköping University
Computational Statistics · Machine Learning · Monte Carlo Methods · System Identification