🤖 AI Summary
This paper investigates the fundamental distinction between non-ergodic convergence behaviors in game-theoretic learning, focusing on Optimistic Multiplicative Weights Update (OMWU) in $2 \times 2$ matrix games. It analyzes three iteration types—best iterate, random iterate, and last iterate—and establishes that best-iterate and random-iterate convergence are provably inequivalent. Methodologically, the authors develop a two-phase analytical framework to derive an $O(T^{-1/6})$ non-ergodic convergence rate for the best iterate, while constructing explicit counterexamples showing that no polynomial lower bound exists for random-iterate convergence. Leveraging dynamic regret analysis, duality gap metrics, and both asymptotic and non-asymptotic theory, they challenge the conventional assumption of equivalence between best and random iterates. The work provides new theoretical foundations and practical criteria for non-ergodic learning dynamics in adversarial and game-theoretic settings.
📝 Abstract
Non-ergodic convergence of learning dynamics in games has been widely studied recently because of its importance in both theory and practice. Recent work (Cai et al., 2024) showed that a broad class of learning dynamics, including Optimistic Multiplicative Weights Update (OMWU), can exhibit arbitrarily slow last-iterate convergence even in simple $2 \times 2$ matrix games, despite many of these dynamics being known to converge asymptotically in the last iterate. It remains unclear, however, whether these algorithms achieve fast non-ergodic convergence under weaker criteria, such as best-iterate convergence. We show that for $2 \times 2$ matrix games, OMWU achieves an $O(T^{-1/6})$ best-iterate convergence rate, in stark contrast to its slow last-iterate convergence in the same class of games. Furthermore, we establish a lower bound showing that OMWU does not achieve any polynomial random-iterate convergence rate, measured by the expected duality gaps across all iterates. This result challenges the conventional wisdom that random-iterate convergence is essentially equivalent to best-iterate convergence, with the former often used as a proxy for establishing the latter. Our analysis uncovers a new connection to dynamic regret and presents a novel two-phase approach to best-iterate convergence, which could be of independent interest.
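To make the objects in the abstract concrete, here is a minimal sketch of the standard OMWU dynamics in a $2 \times 2$ zero-sum matrix game, tracking the duality gap of the last iterate and of the best iterate seen so far. The payoff matrix, step size, initialization, and horizon are illustrative assumptions, not values from the paper, and the code shows only the well-known OMWU update rule, not the paper's analysis.

```python
import numpy as np

# Illustrative 2x2 zero-sum game (matching pennies); x maximizes x^T A y,
# y minimizes. eta and T are arbitrary demo choices.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
eta, T = 0.1, 2000

x = np.array([0.7, 0.3])  # off-equilibrium start (assumed for illustration)
y = np.array([0.4, 0.6])
g_x_prev, g_y_prev = A @ y, x @ A  # previous-round gradients

best_gap, last_gap = np.inf, np.inf
for t in range(T):
    g_x, g_y = A @ y, x @ A
    # Optimistic update: use the extrapolated gradient 2*g_t - g_{t-1}
    x = x * np.exp(eta * (2 * g_x - g_x_prev))   # x-player ascends
    x /= x.sum()
    y = y * np.exp(-eta * (2 * g_y - g_y_prev))  # y-player descends
    y /= y.sum()
    g_x_prev, g_y_prev = g_x, g_y
    # Duality gap of the current (last) iterate:
    # max_i (A y)_i - min_j (x^T A)_j, zero exactly at equilibrium.
    last_gap = (A @ y).max() - (x @ A).min()
    best_gap = min(best_gap, last_gap)  # best-iterate criterion
```

The best-iterate gap is by definition monotone and never exceeds the last-iterate gap; the paper's results concern how fast each of these quantities (and the gap of a uniformly random iterate) can be guaranteed to shrink.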