🤖 AI Summary
This work addresses the severe convergence degradation that can arise in decentralized random-walk learning when weighted sampling is implemented with the Metropolis-Hastings (MH) algorithm, which can trap the random walk in a small region of the network. The paper identifies and names this previously unexplored phenomenon "entrapment" and proposes Metropolis-Hastings with Lévy jumps (MHLJ), a hybrid transition strategy that adds occasional long-range transitions to restore global exploration while respecting local information constraints. Working under data heterogeneity (non-i.i.d. data), the authors establish a convergence rate that makes explicit the roles of data heterogeneity, the network spectral gap, and the jump probability. Experiments show that MHLJ mitigates entrapment and accelerates convergence.
📝 Abstract
We study decentralized learning over networks where data are distributed across nodes without a central coordinator. Random walk learning is a token-based approach in which a single model is propagated across the network and updated at each visited node using local data, thereby incurring low communication and computational overhead. In weighted random-walk learning, the transition matrix is designed to achieve a desired sampling distribution, thereby speeding up convergence under data heterogeneity. We show that implementing weighted sampling via the Metropolis-Hastings algorithm can lead to a previously unexplored phenomenon we term entrapment: the random walk may become trapped in a small region of the network, resulting in highly correlated updates and severely degraded convergence. To address this issue, we propose Metropolis-Hastings with Lévy jumps (MHLJ), which introduces occasional long-range transitions to restore exploration while respecting local information constraints. We establish a convergence rate that explicitly characterizes the roles of data heterogeneity, network spectral gap, and jump probability, and demonstrate through experiments that MHLJ effectively eliminates entrapment and significantly speeds up decentralized learning.
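To make the mechanism concrete, the following is a minimal Python sketch, not the authors' implementation. The MH part is the standard Metropolis-Hastings chain with uniform-neighbor proposals, which needs only the degrees and target weights of adjacent nodes. The jump part is an assumption made for illustration: the abstract does not specify how Lévy jumps are realized, so here a jump is modeled as a burst of uniform neighbor hops whose length is drawn from a truncated power law. The names `mh_transition_matrix` and `mhlj_step` and the parameters `alpha` and `max_hops` are hypothetical.

```python
import numpy as np


def mh_transition_matrix(adj, target):
    """Standard MH chain on a graph with uniform-neighbor proposals.

    adj: symmetric 0/1 adjacency matrix without self-loops.
    target: desired sampling distribution over nodes (positive entries).
    """
    adj = np.asarray(adj, dtype=float)
    target = np.asarray(target, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    P = np.zeros((n, n))
    for i in range(n):
        for j in np.flatnonzero(adj[i]):
            # Propose j with probability 1/deg[i], then accept with the MH ratio.
            accept = min(1.0, (target[j] * deg[i]) / (target[i] * deg[j]))
            P[i, j] = accept / deg[i]
        # Rejected proposals keep the token at node i.
        P[i, i] = max(0.0, 1.0 - P[i].sum())
    return P


def mhlj_step(node, adj, P, p_jump, rng, alpha=1.5, max_hops=8):
    """One token move: an MH step, or (with probability p_jump) a long jump.

    ASSUMPTION: the jump is a run of uniform neighbor hops whose length r
    follows a truncated power law P(r) ~ r^(-alpha), r = 1..max_hops.
    This is an illustrative stand-in, not the paper's jump rule.
    """
    adj = np.asarray(adj)
    if rng.random() < p_jump:
        lengths = np.arange(1, max_hops + 1)
        weights = lengths.astype(float) ** (-alpha)
        r = int(rng.choice(lengths, p=weights / weights.sum()))
        for _ in range(r):
            node = rng.choice(np.flatnonzero(adj[node]))
        return int(node)
    return int(rng.choice(len(P), p=P[node]))
```

With `p_jump = 0` this reduces to the plain MH walk, whose stationary distribution is `target`. A large ratio `target[i] / target[j]` between a node and its neighbors makes the acceptance probability for leaving `i` small, which gives one intuition for how a walk can linger in a small region. The sketch makes no claim about how jumps affect the stationary distribution; that analysis is the subject of the paper's convergence result.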