🤖 AI Summary
In Bayesian coreset construction, the state-of-the-art method, Coreset MCMC, trains coreset weights via stochastic gradient optimization, but the quality of its posterior approximation is sensitive to a manually tuned learning rate. This paper proposes Hot DoG, a Hot-start Distance over Gradient procedure that removes the learning rate from Coreset MCMC entirely: the step size is set adaptively from the distance the iterates have travelled and the accumulated gradient magnitudes, so no user tuning is required. Experiments across multiple benchmark tasks demonstrate that Hot DoG produces higher-quality posterior approximations than other learning-rate-free stochastic gradient methods and performs competitively with optimally tuned ADAM, without any manual tuning. It thus offers a favorable trade-off among computational efficiency, robustness, and practical usability.
📝 Abstract
A Bayesian coreset is a small, weighted subset of a data set that replaces the full data during inference to reduce computational cost. The state-of-the-art coreset construction algorithm, Coreset Markov chain Monte Carlo (Coreset MCMC), uses draws from an adaptive Markov chain targeting the coreset posterior to train the coreset weights via stochastic gradient optimization. However, the quality of the constructed coreset, and thus the quality of its posterior approximation, is sensitive to the stochastic optimization learning rate. In this work, we propose a learning-rate-free stochastic gradient optimization procedure, Hot-start Distance over Gradient (Hot DoG), for training coreset weights in Coreset MCMC without user tuning effort. Empirical results demonstrate that Hot DoG provides higher quality posterior approximations than other learning-rate-free stochastic gradient methods, and performs competitively with optimally-tuned ADAM.
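To make the "distance over gradient" idea concrete, here is a minimal sketch of a generic DoG-style step-size rule applied to plain stochastic gradient descent. This is not the paper's Hot DoG algorithm (which adds a hot-start phase and operates on Coreset MCMC weight gradients); it only illustrates the core learning-rate-free mechanism the name refers to: the step size at each iteration is the maximum distance travelled from the starting point divided by the root of the cumulative squared gradient norms. All function and parameter names here are hypothetical.

```python
import numpy as np

def dog_sgd(grad_fn, x0, n_steps=300, r0=1e-6, eps=1e-8):
    """Sketch of a Distance-over-Gradient (DoG) style update (hypothetical API).

    No learning rate is supplied: the step size is derived from
    (max distance from x0 so far) / sqrt(sum of squared gradient norms).
    `r0` is a small initial "distance" so the very first step is nonzero.
    """
    x = np.asarray(x0, dtype=float).copy()
    max_dist = r0       # running max of ||x_t - x_0||, seeded with r0
    grad_sq_sum = 0.0   # running sum of ||g_t||^2
    for _ in range(n_steps):
        g = grad_fn(x)
        grad_sq_sum += float(np.dot(g, g))
        eta = max_dist / (np.sqrt(grad_sq_sum) + eps)  # adaptive step size
        x = x - eta * g
        max_dist = max(max_dist, float(np.linalg.norm(x - x0)))
    return x
```

For example, minimizing the quadratic f(x) = (x - 3)^2 from x = 0 with `grad_fn = lambda x: 2 * (x - 3.0)` drives the iterate toward 3 with no step-size tuning; early steps are tiny (governed by `r0`) and grow as the iterate moves away from its starting point.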