Tackling the Problem of Distributional Shifts: Correcting Misspecified, High-Dimensional Data-Driven Priors for Inverse Problems

📅 2024-07-24

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

career value

204K/year

🤖 AI Summary

In high-dimensional inverse problems, data-driven priors trained on mismatched distributions—such as synthetic or degraded data—induce systematic bias in posterior inference. Method: We propose a posterior-sample-based iterative retraining framework for prior calibration. This is the first approach to incorporate posterior sampling feedback into data-driven prior learning, enabling adaptive correction of misspecified high-dimensional priors. The method integrates score-based generative modeling, Bayesian inversion, and iterative reweighted training, and is applied to background image reconstruction in strong gravitational lensing. Results: Experiments demonstrate that, within a few iterations, the calibrated prior converges closely to the true population distribution, significantly reducing posterior estimation bias. The framework exhibits empirical convergence, robustness under distribution shift, and interpretability through explicit posterior feedback.

Technology Category

Application Category

📝 Abstract

Bayesian inference for inverse problems hinges critically on the choice of priors. In the absence of specific prior information, population-level distributions can serve as effective priors for parameters of interest. With the advent of machine learning, the use of data-driven population-level distributions (encoded, e.g., in a trained deep neural network) as priors is emerging as an appealing alternative to simple parametric priors in a variety of inverse problems. However, in many astrophysical applications, it is often difficult or even impossible to acquire independent and identically distributed samples from the underlying data-generating process of interest to train these models. In these cases, corrupted data or a surrogate, e.g. a simulator, is often used to produce training samples, meaning that there is a risk of obtaining misspecified priors. This, in turn, can bias the inferred posteriors in ways that are difficult to quantify, which limits the potential applicability of these models in real-world scenarios. In this work, we propose addressing this issue by iteratively updating the population-level distributions by retraining the model with posterior samples from different sets of observations, and we showcase the potential of this method on the problem of background image reconstruction in strong gravitational lensing when score-based models are used as data-driven priors. We show that, starting from a misspecified prior distribution, the updated distribution becomes progressively closer to the underlying population-level distribution, and the resulting posterior samples exhibit reduced bias after several updates.

Problem

Research questions and friction points this paper is trying to address.

High-Dimensional Data

Bayesian Methods

Machine Learning Uncertainty

Innovation

Methods, ideas, or system contributions that make the work stand out.

High-Dimensional Inverse Problems

Bayesian Methods

Strong Gravitational Lensing Reconstruction

🔎 Similar Papers

No similar papers found.