🤖 AI Summary
Neural network loss landscapes are inherently degenerate, and the global convergence guarantees behind standard stochastic gradient MCMC (SGMCMC) algorithms rest on assumptions that are likely incompatible with this degeneracy. The authors therefore argue for a shift in focus from global to local posterior sampling, and introduce a novel scalable benchmark for evaluating how faithfully SGMCMC algorithms capture the local geometry of the posterior. Comparing several common algorithms on this benchmark, they find that RMSProp-preconditioned stochastic gradient Langevin dynamics (SGLD) best represents local posterior geometry. Although global convergence guarantees remain out of reach, experiments show that non-trivial local statistical structure can be extracted from models with up to O(100M) parameters, providing a practical baseline for evaluating SGMCMC in deep learning.
📝 Abstract
Degeneracy is an inherent feature of the loss landscape of neural networks, but it is not well understood how stochastic gradient MCMC (SGMCMC) algorithms interact with this degeneracy. In particular, current global convergence guarantees for common SGMCMC algorithms rely on assumptions which are likely incompatible with degenerate loss landscapes. In this paper, we argue that this gap requires a shift in focus from global to local posterior sampling, and, as a first step, we introduce a novel scalable benchmark for evaluating the local sampling performance of SGMCMC algorithms. We evaluate a number of common algorithms, and find that RMSProp-preconditioned SGLD is most effective at faithfully representing the local geometry of the posterior distribution. Although we lack theoretical guarantees about global sampler convergence, our empirical results show that we are able to extract non-trivial local information in models with up to O(100M) parameters.