🤖 AI Summary
In Bayesian coreset construction, the state-of-the-art method, Coreset MCMC, trains coreset weights via stochastic gradient optimization, but the quality of its posterior approximation is sensitive to a manually tuned learning rate. This paper proposes Hot DoG, a Hot-start Distance over Gradient procedure that removes the learning rate from Coreset MCMC entirely: the step size is set adaptively from the distance the iterates have travelled and the accumulated gradient magnitudes, so no user tuning is required. Experiments across multiple benchmark tasks demonstrate that Hot DoG produces higher-quality posterior approximations than other learning-rate-free stochastic gradient methods and performs competitively with optimally tuned ADAM, without any manual tuning. It thus offers a favorable trade-off among computational efficiency, robustness, and practical usability.
📝 Abstract
A Bayesian coreset is a small, weighted subset of a data set that replaces the full data during inference to reduce computational cost. The state-of-the-art coreset construction algorithm, Coreset Markov chain Monte Carlo (Coreset MCMC), uses draws from an adaptive Markov chain targeting the coreset posterior to train the coreset weights via stochastic gradient optimization. However, the quality of the constructed coreset, and thus the quality of its posterior approximation, is sensitive to the stochastic optimization learning rate. In this work, we propose a learning-rate-free stochastic gradient optimization procedure, Hot-start Distance over Gradient (Hot DoG), for training coreset weights in Coreset MCMC without user tuning effort. Empirical results demonstrate that Hot DoG provides higher quality posterior approximations than other learning-rate-free stochastic gradient methods, and performs competitively with optimally-tuned ADAM.
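To make the "distance over gradient" idea concrete, here is a minimal sketch of a generic DoG-style step-size rule applied to plain stochastic gradient descent. This is not the paper's Hot DoG algorithm (which adds a hot-start phase and operates on Coreset MCMC weight gradients); it only illustrates the core learning-rate-free mechanism the name refers to: the step size at each iteration is the maximum distance travelled from the starting point divided by the root of the cumulative squared gradient norms. All function and parameter names here are hypothetical.

```python
import numpy as np

def dog_sgd(grad_fn, x0, n_steps=300, r0=1e-6, eps=1e-8):
    """Sketch of a Distance-over-Gradient (DoG) style update (hypothetical API).

    No learning rate is supplied: the step size is derived from
    (max distance from x0 so far) / sqrt(sum of squared gradient norms).
    `r0` is a small initial "distance" so the very first step is nonzero.
    """
    x = np.asarray(x0, dtype=float).copy()
    max_dist = r0       # running max of ||x_t - x_0||, seeded with r0
    grad_sq_sum = 0.0   # running sum of ||g_t||^2
    for _ in range(n_steps):
        g = grad_fn(x)
        grad_sq_sum += float(np.dot(g, g))
        eta = max_dist / (np.sqrt(grad_sq_sum) + eps)  # adaptive step size
        x = x - eta * g
        max_dist = max(max_dist, float(np.linalg.norm(x - x0)))
    return x
```

For example, minimizing the quadratic f(x) = (x - 3)^2 from x = 0 with `grad_fn = lambda x: 2 * (x - 3.0)` drives the iterate toward 3 with no step-size tuning; early steps are tiny (governed by `r0`) and grow as the iterate moves away from its starting point.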