Gradient-flow adaptive importance sampling for Bayesian leave one out cross-validation with application to sigmoidal classification models

📅 2024-02-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the severe instability of Monte Carlo estimates for leave-one-out (LOO) cross-validation in Bayesian models—caused by high variance in importance weights—this paper proposes a gradient-flow-guided adaptive importance sampling framework. Methodologically, it introduces the gradient flow variational principle to LOO importance sampling for the first time, yielding an explicit nonlinear transformation that dynamically maps the full-data posterior to neighborhoods of individual LOO posteriors. It further develops an efficient Jacobian determinant approximation that avoids full Hessian computation. Theoretically and empirically, the method substantially reduces LOO weight variance and improves Monte Carlo integration stability—particularly in ill-posed regimes where $n \ll p$. Experiments demonstrate robustness and computational efficiency on sigmoid-based classification models, including logistic regression and shallow ReLU networks.
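For context, the instability the summary describes arises in standard LOO importance sampling, where full-data posterior draws are reweighted by $w_i^{(s)} \propto 1/p(y_i \mid \theta^{(s)})$. The sketch below (a toy $n \ll p$ logistic regression with a Gaussian stand-in for the posterior; all names and values are illustrative assumptions, not the paper's code or data) computes these weights and the effective sample size that diagnoses weight degeneracy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy n << p logistic-regression setup (hypothetical stand-in for the paper's data).
n, p = 10, 50
X = rng.normal(size=(n, p))
beta_true = rng.normal(size=p)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

# Stand-in "posterior" draws: a ridge estimate plus isotropic noise.
# (A real workflow would use MCMC draws from the full-data posterior.)
S = 4000
beta_hat = np.linalg.solve(X.T @ X + np.eye(p), X.T @ (2 * y - 1))
draws = beta_hat + 0.3 * rng.normal(size=(S, p))

def loglik_point(beta_s, x_i, y_i):
    """Bernoulli log-likelihood of one observation under one draw (stable form)."""
    eta = x_i @ beta_s
    return y_i * eta - np.logaddexp(0.0, eta)

# Standard LOO importance weights for observation i: w^(s) ∝ 1 / p(y_i | beta^(s)).
i = 0
loglik = np.array([loglik_point(b, X[i], y[i]) for b in draws])
logw = -loglik
logw -= logw.max()            # stabilise before exponentiating
w = np.exp(logw)
w /= w.sum()

# Effective sample size: ESS far below S signals the weight degeneracy that
# the paper's gradient-flow transformation is designed to mitigate.
ess = 1.0 / np.sum(w ** 2)
print(f"ESS = {ess:.1f} of {S} draws")
```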

📝 Abstract
We introduce gradient-flow-guided adaptive importance sampling (IS) transformations for stabilizing Monte Carlo approximations of leave-one-out (LOO) cross-validated predictions for Bayesian models. After defining two variational problems, we derive corresponding simple nonlinear transformations that utilize gradient information to shift a model's pre-trained full-data posterior closer to the target LOO posterior predictive distributions. In doing so, the transformations stabilize importance weights. The resulting Monte Carlo integrals depend on Jacobian determinants with respect to the model Hessian. We derive closed-form exact formulae for these Jacobian determinants in the cases of logistic regression and shallow ReLU-activated artificial neural networks, and provide a simple approximation that sidesteps the need to compute full Hessian matrices and their spectra. We test the methodology on an $n \ll p$ dataset that is known to produce unstable LOO IS weights.
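As a hedged illustration of the transformation idea: one explicit Euler step along the negative score of the held-out likelihood moves a full-data posterior draw toward the LOO posterior, and for logistic regression the rank-one Hessian of a single observation gives the Jacobian determinant in closed form via the matrix determinant lemma. The step size, function names, and exact construction below are assumptions, not the authors' precise method:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loo_shift(beta, x_i, y_i, eps=0.1):
    """One Euler step flowing a full-data posterior draw toward the LOO posterior:
        beta' = beta - eps * grad log p(y_i | beta).
    For logistic regression, grad log p(y_i | beta) = (y_i - sigma(x_i @ beta)) x_i
    and the observation Hessian is rank one, -sigma(1-sigma) x_i x_i^T, so the
    Jacobian determinant of the map has the closed form
        det(I + eps * sigma * (1 - sigma) * x_i x_i^T) = 1 + eps*sigma*(1-sigma)*||x_i||^2
    by the matrix determinant lemma. Returns the shifted draw and log|det J|.
    """
    s = sigmoid(x_i @ beta)
    grad = (y_i - s) * x_i                          # score of the held-out point
    beta_new = beta - eps * grad                    # move against its influence
    logdet = np.log1p(eps * s * (1.0 - s) * (x_i @ x_i))
    return beta_new, logdet

# Toy usage on a single draw (values hypothetical).
rng = np.random.default_rng(1)
p = 5
beta = rng.normal(size=p)
x_i, y_i = rng.normal(size=p), 1.0
beta_shifted, logdet = loo_shift(beta, x_i, y_i)
```

The `log1p` form avoids ever materializing the $p \times p$ Jacobian, which is the spirit of the paper's Hessian-free determinant approximation.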
Problem

Research questions and friction points this paper is trying to address.

Stabilize importance sampling weights for Bayesian LOO cross-validation
Develop perturbative transformations to avoid model refitting
Address unstable LOO weights in high-dimensional datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bijective perturbative transformations for adaptation
Partial moment matching for posterior adjustment
Gradient flow evolution for weight stabilization
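The "partial moment matching" bullet can be illustrated by the generic adaptive-IS device of matching only the first moment: translate the full-data posterior draws so their plain mean lands on the importance-weighted (LOO-targeted) mean. A translation has Jacobian determinant 1, so no volume correction is needed. This is a sketch of the generic idea under that assumption, not the paper's implementation:

```python
import numpy as np

def mean_match(draws, logw):
    """Partial moment matching (mean only): shift draws so their plain mean
    equals the importance-weighted mean. The map theta -> theta + (mu_w - mu)
    is a pure translation, so |det J| = 1 and weights need no volume term.
    (Illustrative sketch of the generic adaptive-IS idea.)
    """
    w = np.exp(logw - logw.max())
    w /= w.sum()
    mu = draws.mean(axis=0)           # plain posterior mean
    mu_w = w @ draws                  # importance-weighted (LOO-targeted) mean
    return draws + (mu_w - mu)

# Toy usage with made-up log-weights favouring negative first coordinates.
rng = np.random.default_rng(2)
draws = rng.normal(size=(1000, 3))
logw = -0.5 * draws[:, 0]
shifted = mean_match(draws, logw)
```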
Authors
Joshua C. Chang
NIH Clinical Center, Rehabilitation Medicine, Epidemiology and Biostatistics Section, Bethesda MD, USA
Xiangting Li
UCLA Department of Computational Medicine, Los Angeles CA, USA
Shixin Xu
Duke Kunshan University
machine learning · math biology · electrodynamics · moving contact lines
Hao-Ren Yao
Carnegie Mellon University
Health Informatics · Machine Learning · Data Mining
J. Porcino
NIH Clinical Center, Rehabilitation Medicine, Epidemiology and Biostatistics Section, Bethesda MD, USA
C. Chow
NIH NIDDK, Laboratory of Biological Modeling, Bethesda MD, USA