🤖 AI Summary
Efficiently tuning step sizes for unadjusted Markov kernels, such as Langevin Monte Carlo (LMC) and its dynamical variants, within sequential Monte Carlo (SMC) samplers remains challenging: standard tuning objectives (e.g., the Metropolis-Hastings acceptance rate) no longer apply, and end-to-end gradient-based optimization incurs prohibitive computational cost.
Method: We propose a greedy adaptive framework based on incremental KL divergence minimization. It requires no gradients and no manual step-size initialization, yielding the first fully automated step-size optimization algorithm for unadjusted kernels (a minimal sketch of the greedy selection appears below). We further design the first dedicated parameter scheduling scheme for kinetic (dynamical) LMC and generate complete step-size schedules from only a few base SMC runs.
Results: Experiments demonstrate speedups of more than an order of magnitude over stochastic gradient-based tuning, together with substantial gains in sampling efficiency and stability. The learned schedules transfer robustly across diverse target distributions, indicating broad applicability and resilience to distributional shift.
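To make the greedy criterion concrete, the following minimal sketch shows how a gradient-free step-size search could look within a single SMC tempering step. The weight-based Jensen-gap proxy for the incremental KL divergence, the candidate grid, and the overdamped Langevin move are illustrative assumptions for exposition, not the paper's exact algorithm; `incremental_log_weight` is a hypothetical stand-in for the sampler's incremental log-weights.

```python
# Illustrative sketch only: a gradient-free, greedy step-size search for one SMC
# tempering step. The Jensen-gap KL proxy, candidate grid, and overdamped
# Langevin move are assumptions for exposition, not the paper's exact algorithm.
import numpy as np

def incremental_kl_proxy(log_w):
    """Jensen-gap estimate of KL(proposal || target) from incremental log-weights."""
    m = log_w.max()
    return (m + np.log(np.mean(np.exp(log_w - m)))) - np.mean(log_w)

def lmc_move(x, grad_log_target, eps, rng):
    """One unadjusted overdamped Langevin move applied to all particles."""
    return x + eps * grad_log_target(x) + np.sqrt(2.0 * eps) * rng.standard_normal(x.shape)

def greedy_step_size(x, grad_log_target, incremental_log_weight, candidates, rng):
    """Pick the candidate step size whose move gives the smallest KL proxy."""
    scores = []
    for eps in candidates:
        moved = lmc_move(x, grad_log_target, eps, rng)
        scores.append(incremental_kl_proxy(incremental_log_weight(moved)))
    return candidates[int(np.argmin(scores))]

# Hypothetical usage on a standard Gaussian target:
rng = np.random.default_rng(0)
particles = rng.standard_normal((1024, 2))
grad_log_target = lambda x: -x                       # score of N(0, I)
inc_log_w = lambda x: -0.05 * np.sum(x**2, axis=1)   # stand-in incremental log-weights
eps_star = greedy_step_size(particles, grad_log_target, inc_log_w,
                            np.geomspace(1e-3, 1.0, 10), rng)
```

Repeating the same candidate search at every tempering step would produce the kind of step-size schedule described above; the selection rule itself needs neither gradients nor a hand-picked initial step size.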
📝 Abstract
The performance of sequential Monte Carlo (SMC) samplers depends heavily on the tuning of the Markov kernels used in the path proposal. For SMC samplers with unadjusted Markov kernels, standard tuning objectives, such as the Metropolis-Hastings acceptance rate or the expected squared jump distance, are no longer applicable. While stochastic gradient-based end-to-end optimization has been explored for tuning SMC samplers, it often incurs excessive training costs, even when tuning just the kernel step sizes. In this work, we propose a general adaptation framework for tuning the Markov kernels in SMC samplers by minimizing the incremental Kullback-Leibler (KL) divergence between the proposal and target paths. For step-size tuning, we provide a gradient- and tuning-free algorithm that is generally applicable to kernels such as Langevin Monte Carlo (LMC). We further demonstrate the utility of our approach by providing a tailored scheme for tuning *kinetic* LMC used in SMC samplers. Our implementations obtain a full *schedule* of tuned parameters at the cost of a few vanilla SMC runs, a fraction of the cost of gradient-based approaches.
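For readers unfamiliar with the kernel being tuned, the sketch below shows one unadjusted kinetic (underdamped) Langevin step of the kind an SMC sampler would apply between tempering steps. The OBABO splitting and the (step_size, friction) parameterization are common choices used here purely for illustration, not necessarily the paper's exact kernel; these are the kinds of parameters a tailored scheduling scheme would select per tempering step.

```python
# Illustrative sketch of one unadjusted kinetic (underdamped) Langevin step of
# the kind tuned inside an SMC sampler. The OBABO splitting and (step_size,
# friction) parameterization are common choices, not necessarily the paper's.
import numpy as np

def kinetic_lmc_step(x, v, grad_log_target, step_size, friction, rng):
    """One OBABO step of unadjusted kinetic Langevin dynamics (unit mass, unit temperature)."""
    a = np.exp(-friction * step_size / 2.0)   # momentum damping over half a step
    b = np.sqrt(1.0 - a**2)                   # matching refreshment noise scale

    v = a * v + b * rng.standard_normal(v.shape)   # O: partial momentum refreshment
    v = v + 0.5 * step_size * grad_log_target(x)   # B: half-step momentum kick
    x = x + step_size * v                          # A: full-step position update
    v = v + 0.5 * step_size * grad_log_target(x)   # B: half-step momentum kick
    v = a * v + b * rng.standard_normal(v.shape)   # O: partial momentum refreshment
    return x, v
```

In this parameterization, both step_size and friction (and their values across tempering steps) are the quantities a dedicated scheduling scheme would tune.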