🤖 AI Summary
This work addresses the challenge of sampling from a target distribution in a high-dimensional space while respecting manifold constraints, and proposes U-turn chains for this purpose. The approach builds a Markov chain from short forward–backward steps of a diffusion model, so that each proposal stays on the learned data manifold, and pairs it with a Metropolis–Hastings correction to sample from energy-modified targets. Analysis on synthetic languages shows that minimal U-turn dynamics undergo an ergodicity-breaking phase transition caused by fragmentation of the data manifold, with ergodicity restored at larger U-turn magnitude. The relaxation order of features also depends on U-turn magnitude: in the non-ergodic regime, low-level features relax faster than high-level ones, and the ordering inverts only at sufficiently large magnitude. Experiments on natural language and natural images show that minimal U-turns relax slowly, especially for high-level features, and that the inversion appears only at large noise, when mixing is efficient.
📝 Abstract
Sampling from learned high-dimensional distributions is a foundational computational problem. We introduce U-turn chains: Markov chains obtained by iterating short forward-backward steps of a diffusion model, in which each step proposes a move that remains on the learned data manifold and, paired with a Metropolis-Hastings correction, samples from energy-modified targets. For synthetic languages, we show that minimal U-turn dynamics undergoes an ergodicity-breaking phase transition driven by fragmentation of the data manifold; ergodicity is restored at larger U-turn magnitude. In the non-ergodic regime, low-level features relax faster than high-level ones, an ordering that inverts only at sufficiently large U-turn magnitude. We test these predictions on natural language and natural images. In both modalities, minimal U-turns relax slowly, especially for high-level features approximated by deep representations in CNNs or LLMs. The layer-ordering inversion appears only at large noise when mixing is efficient -- signatures consistent with strongly constrained, weakly mixing local dynamics. We discuss the implications of these results for sampling with diffusion models.
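To make the construction concrete, here is a minimal sketch of a U-turn chain on a toy problem. It is not the authors' implementation, and everything in it is an assumption made for illustration: the data distribution is a one-dimensional standard Gaussian, the forward step is variance-preserving noising to a level `alpha`, and the backward step samples the exact posterior in place of a learned denoiser. Under those assumptions the proposal is reversible with respect to the data distribution, so for a target proportional to `p0(x) * exp(-energy(x))` the Metropolis-Hastings ratio reduces to `exp(energy(x) - energy(x_new))`. The names `u_turn_proposal`, `u_turn_chain`, `alpha`, and `energy` are hypothetical.

```python
import math
import random


def u_turn_proposal(x, alpha, rng):
    """One forward-backward (U-turn) move for toy data p0 = N(0, 1).

    alpha in (0, 1) is the signal fraction kept by the forward step:
    alpha near 1 is a small U-turn, alpha near 0 is a large one.
    """
    # Forward step: variance-preserving noising of the current state.
    x_t = math.sqrt(alpha) * x + math.sqrt(1.0 - alpha) * rng.gauss(0.0, 1.0)
    # Backward step (assumption: ideal denoiser). For N(0, 1) data the exact
    # posterior is x0 | x_t ~ N(sqrt(alpha) * x_t, 1 - alpha).
    return math.sqrt(alpha) * x_t + math.sqrt(1.0 - alpha) * rng.gauss(0.0, 1.0)


def u_turn_chain(energy, alpha, n_steps, x0=0.0, seed=0):
    """Run a U-turn chain targeting a density proportional to p0(x) * exp(-energy(x))."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        x_new = u_turn_proposal(x, alpha, rng)
        # The toy proposal is reversible w.r.t. p0, so the p0 and proposal
        # terms cancel and only the energy difference remains.
        log_accept = energy(x) - energy(x_new)
        if log_accept >= 0.0 or rng.random() < math.exp(log_accept):
            x = x_new
        samples.append(x)
    return samples


if __name__ == "__main__":
    # Linear tilt energy(x) = -h * x turns N(0, 1) into N(h, 1).
    h = 1.0
    chain = u_turn_chain(lambda x: -h * x, alpha=0.5, n_steps=20000)
    kept = chain[2000:]  # drop burn-in
    print("sample mean:", sum(kept) / len(kept))
```

With the linear tilt used above, the sample mean should settle near `h`. With a learned diffusion model the backward step is only approximate, so the cancellation in the acceptance ratio would hold only approximately, and the toy Gaussian has no fragmented manifold, so it does not reproduce the ergodicity-breaking transition described in the abstract.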