🤖 AI Summary
This work addresses the lack of a systematic theoretical comparison between the two dominant variants of the No-U-Turn Sampler (NUTS), namely NUTS-mul and NUTS-BPS, with respect to convergence properties and mixing times. Building on Markov chain Monte Carlo theory, the paper establishes the first necessary conditions for geometric ergodicity of both algorithms, provides the first sufficient conditions for ergodicity and geometric ergodicity of NUTS-mul, and derives the first explicit mixing time bound for NUTS-BPS under a standard Gaussian target distribution. The analysis reveals that while the two variants exhibit nearly identical qualitative behavior, with geometric ergodicity governed by the tail properties of the target, NUTS-BPS enjoys strictly smaller mixing-time constants. Moreover, when initialized in the typical set, both algorithms have mixing times that scale as $O(d^{1/4})$ in the dimension $d$, up to logarithmic factors, offering theoretical justification for their effectiveness in high-dimensional Bayesian inference.
📝 Abstract
The No-U-Turn Sampler (NUTS) is the computational workhorse of modern Bayesian software libraries, yet its qualitative and quantitative convergence guarantees were established only recently. A significant gap remains in the theoretical comparison of its two main variants, NUTS-mul and NUTS-BPS, which use multinomial sampling and biased progressive sampling, respectively, for index selection. In this paper, we address this gap through three contributions. First, we derive the first necessary conditions for geometric ergodicity for both variants. Second, we establish the first sufficient conditions for ergodicity and geometric ergodicity for NUTS-mul. Third, we obtain the first mixing time result for NUTS-BPS on a standard Gaussian distribution. Our results show that NUTS-mul and NUTS-BPS exhibit nearly identical qualitative behavior, with geometric ergodicity depending on the tail properties of the target distribution. However, they differ quantitatively in their convergence rates. More precisely, when initialized in the typical set of the canonical Gaussian measure, the mixing times of both NUTS-mul and NUTS-BPS scale as $O(d^{1/4})$ up to logarithmic factors, where $d$ denotes the dimension. Nevertheless, the associated constants are strictly smaller for NUTS-BPS.
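For readers unfamiliar with the two index-selection schemes the abstract contrasts, the following is a minimal, self-contained sketch, not the paper's implementation. Assume each state of a simulated trajectory carries an unnormalized weight $w_i \propto \exp(-H_i)$, where $H_i$ is the Hamiltonian at that state. Multinomial sampling (NUTS-mul) draws an index in proportion to these weights over the whole trajectory, while biased progressive sampling (NUTS-BPS) processes the trajectory as a sequence of doubled subtrees and, at each merge, jumps to the new subtree's representative with probability $\min(1, W_{\text{new}}/W_{\text{old}})$:

```python
import random

def multinomial_select(weights):
    """NUTS-mul style: draw an index in proportion to its weight."""
    return random.choices(range(len(weights)), weights=weights, k=1)[0]

def biased_progressive_select(weights):
    """NUTS-BPS style (sketch): treat the weight sequence as subtrees of
    sizes 1, 1, 2, 4, ... produced by doubling; at each merge, move to a
    representative of the new subtree with probability
    min(1, W_new / W_old), which biases selection toward newer states."""
    idx, w_total = 0, weights[0]   # current index and weight of the old tree
    start, size = 1, 1             # next subtree: offset and size
    while start < len(weights):
        block = weights[start:start + size]
        w_new = sum(block)
        # representative inside the new subtree, chosen multinomially
        rep = start + multinomial_select(block)
        if random.random() < min(1.0, w_new / w_total):
            idx = rep              # biased jump toward the newer subtree
        w_total += w_new
        start += size
        size *= 2
    return idx
```

The bias toward the newer subtree is what gives NUTS-BPS its "progressive" character: states far from the starting point are favored, which is one intuition for the smaller constants in its mixing time bound, though the abstract's guarantees are proved by entirely different (Markov chain) arguments.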