🤖 AI Summary
Markov chains for sampling high-dimensional multimodal distributions suffer from slow mixing due to metastability, especially under worst-case initializations.
Method: Focusing on high-entropy initial distributions (e.g., product measures), we study the random-cluster and Potts models on the complete graph. To overcome the technical challenge that projections of the high-dimensional chains are not themselves Markovian, we develop a unified framework that approximates these projections by tractable one-dimensional random processes and then analyzes the latter's escape from the saddle points separating stable modes.
Contribution/Results: We give the first sharp characterization of the product-measure initializations from which the Chayes–Machta, Swendsen–Wang, and Glauber dynamics mix rapidly, even though their mixing times from worst-case initializations are exponentially slow. Our results show that, with appropriately high-entropy initialization, mixing times drop from exponential to polynomial, while the dynamics correctly spread their mass between the dominant modes. This yields a rigorous, constructive framework for efficient initialization in multimodal sampling.
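To make the dynamics concrete, here is a minimal sketch of one Swendsen–Wang update for the q-state Potts model on the complete graph K_n. This is an illustrative implementation of the standard algorithm (percolate monochromatic edges with probability 1 - exp(-beta/n), then recolor components uniformly), not the paper's analysis; the function name and parameters are chosen for this example.

```python
import math
import random

def swendsen_wang_step(sigma, q, beta, rng):
    """One Swendsen-Wang update for the q-state Potts model on K_n.

    Percolation step: keep each monochromatic edge independently with
    probability p = 1 - exp(-beta/n). Recoloring step: assign each
    resulting connected component a uniformly random color in {0,...,q-1}.
    """
    n = len(sigma)
    p = 1.0 - math.exp(-beta / n)
    parent = list(range(n))  # union-find structure over vertices

    def find(x):
        # Path-halving find.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Percolation: union endpoints of retained monochromatic edges.
    for i in range(n):
        for j in range(i + 1, n):
            if sigma[i] == sigma[j] and rng.random() < p:
                parent[find(i)] = find(j)

    # Recolor each connected component uniformly at random.
    component_color = {}
    for v in range(n):
        root = find(v)
        if root not in component_color:
            component_color[root] = rng.randrange(q)
        sigma[v] = component_color[root]
    return sigma
```

The O(n^2) edge loop is fine for small n on K_n; the point is that component recoloring lets the chain move large blocks of spins at once, which is why its escape from metastable modes is subtle.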
📝 Abstract
A common obstruction to efficient sampling from high-dimensional distributions is the multimodality of the target distribution because Markov chains may get trapped far from stationarity. Still, one hopes that this is only a barrier to the mixing of Markov chains from worst-case initializations and can be overcome by choosing high-entropy initializations, e.g., a product or weakly correlated distribution. Ideally, from such initializations, the dynamics would escape from the saddle points separating modes quickly and spread its mass between the dominant modes. In this paper, we study convergence from high-entropy initializations for the random-cluster and Potts models on the complete graph -- two extensively studied high-dimensional landscapes that pose many complexities like discontinuous phase transitions and asymmetric metastable modes. We study the Chayes--Machta and Swendsen--Wang dynamics for the mean-field random-cluster model and the Glauber dynamics for the Potts model. We sharply characterize the set of product measure initializations from which these Markov chains mix rapidly, even though their mixing times from worst-case initializations are exponentially slow. Our proofs require careful approximations of projections of high-dimensional Markov chains (which are not themselves Markovian) by tractable 1-dimensional random processes, followed by analysis of the latter's escape from saddle points separating stable modes.
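The role of high-entropy initialization can be illustrated with a small simulation of Glauber dynamics for the mean-field Potts model, started from a product measure. This is a hedged sketch: the function name, default parameters, and the convention H(sigma) proportional to -(beta/n) times the number of monochromatic pairs are assumptions for this example, and the returned color counts correspond to the kind of one-dimensional projection (the magnetization vector) the analysis tracks.

```python
import math
import random

def glauber_potts_mean_field(n=200, q=3, beta=1.0, steps=20000,
                             init_probs=None, seed=0):
    """Glauber dynamics for the q-state Potts model on the complete graph K_n.

    init_probs gives the per-color probabilities of the product-measure
    initialization (defaults to uniform, a maximally high-entropy start).
    Returns the final color counts, a low-dimensional projection of the chain.
    """
    rng = random.Random(seed)
    if init_probs is None:
        init_probs = [1.0 / q] * q
    # Product-measure initialization: each spin drawn i.i.d. from init_probs.
    sigma = rng.choices(range(q), weights=init_probs, k=n)
    counts = [0] * q
    for s in sigma:
        counts[s] += 1
    for _ in range(steps):
        i = rng.randrange(n)
        counts[sigma[i]] -= 1  # remove spin i before resampling it
        # Conditional law of sigma[i] given the rest: proportional to
        # exp(beta * (# same-colored neighbors) / n).
        weights = [math.exp(beta * counts[k] / n) for k in range(q)]
        sigma[i] = rng.choices(range(q), weights=weights, k=1)[0]
        counts[sigma[i]] += 1
    return counts
```

Tracking only `counts` rather than the full configuration mirrors the proof strategy described above: the projection is not Markovian, but near the saddle point it can be approximated by a tractable one-dimensional process.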