🤖 AI Summary
This work addresses data assimilation for nonlinear, non-Gaussian systems in black-box settings, where neither the dynamics nor the measurement process is known explicitly. It proposes a closed-form conditional diffusion model that requires no neural network training. The method models the joint distribution of states and measurements with kernel density estimation, which yields an analytical expression for the conditional score function and enables efficient sampling and state estimation. By combining closed-form score functions with diffusion models, the approach avoids the Gaussian assumptions and explicit dynamical models that many traditional filters rely on. Experiments on the Lorenz-63 and Lorenz-96 systems with nonlinear measurement models show that it outperforms ensemble Kalman filters and particle filters when small to moderate ensemble sizes are used.
📝 Abstract
We propose closed-form conditional diffusion models for data assimilation. Diffusion models use data to learn the score function (defined as the gradient of the log-probability density of a data distribution), allowing them to generate new samples from the data distribution by reversing a noise injection process. While it is common to train neural networks to approximate the score function, we instead leverage the analytical tractability of the score function to assimilate measurements into the states of a system. To enable efficient evaluation of the score function, we use kernel density estimation to model the joint distribution of the states and their corresponding measurements. The proposed approach also inherits the ability of conditional diffusion models to operate in black-box settings, i.e., it can accommodate systems and measurement processes without explicit knowledge of them. This ability, combined with the capacity of diffusion models to approximate complex, non-Gaussian probability distributions, means that the proposed approach offers advantages over many widely used filtering methods. We evaluate the proposed method on nonlinear data assimilation problems based on the Lorenz-63 and Lorenz-96 systems of moderate dimensionality with nonlinear measurement models. Results show that the proposed approach outperforms the widely used ensemble Kalman and particle filters when small to moderate ensemble sizes are used.
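To make the idea of a closed-form conditional score concrete, here is a minimal sketch, not the authors' implementation. It assumes isotropic Gaussian kernels on paired samples `(xs[i], ys[i])`, and a variance-exploding forward process that adds noise of standard deviation `sigma` to the state only. Under those assumptions the noised conditional density is a weighted Gaussian mixture, so its score is a softmax-weighted average of the directions to the ensemble members. The names `log_weights`, `conditional_score`, `h_x`, and `h_y` are illustrative and do not come from the paper.

```python
import numpy as np


def log_weights(x, y, xs, ys, sigma, h_x, h_y):
    """Unnormalized log mixture weights for each ensemble member.

    Assumption: Gaussian kernels with bandwidths h_x (state) and h_y
    (measurement); noising the state by sigma widens the state kernel
    variance to h_x**2 + sigma**2.
    """
    var_x = h_x**2 + sigma**2
    d_x = np.sum((xs - x) ** 2, axis=1)  # squared distances in state space
    d_y = np.sum((ys - y) ** 2, axis=1)  # squared distances in measurement space
    return -0.5 * d_x / var_x - 0.5 * d_y / h_y**2


def conditional_score(x, y, xs, ys, sigma, h_x=0.1, h_y=0.1):
    """Gradient of log p_sigma(x | y) with respect to x, in closed form."""
    var_x = h_x**2 + sigma**2
    logw = log_weights(x, y, xs, ys, sigma, h_x, h_y)
    logw = logw - logw.max()  # stabilize the softmax
    w = np.exp(logw)
    w = w / w.sum()
    # Weights sum to one, so sum_i w_i * (xs[i] - x) equals w @ xs - x.
    return (w @ xs - x) / var_x
```

In a filtering loop, `xs` would be forecast states, `ys` their simulated measurements, and `y` the actual measurement; a reverse-time sampler would call `conditional_score` at decreasing values of `sigma`. No training step is involved, and only samples of states and measurements are needed, which is what permits black-box use.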