🤖 AI Summary
This work investigates multi-agent learning dynamics in periodic zero-sum games, focusing on how the dynamics deviate from equilibrium when the learning speed synchronizes with the speed at which the game oscillates. Using dynamical systems modeling, Lyapunov stability analysis, and numerical simulations, we prove, under assumptions introduced for the analysis, that exact synchronization causes the learning dynamics to diverge from the Nash equilibrium and prevents the time-averaged strategy trajectory from converging, revealing, for the first time, a “synchronization-induced divergence” phenomenon. This finding challenges the conventional belief that time averages necessarily converge in periodic zero-sum games and establishes a new principle: “synchronization determines convergence.” We further demonstrate the robustness of this principle under generalized settings, including non-uniform periods and heterogeneous learning rates. Empirical results show a sharp phase transition in time-average convergence as the degree of synchronization varies, providing a mechanistic explanation and a quantitative criterion for convergence behavior in dynamic game learning.
📝 Abstract
Learning in zero-sum games studies settings in which multiple agents learn their strategies in competition with one another. In such multi-agent learning, the strategies often cycle around their optimum, i.e., the Nash equilibrium. When the game varies periodically (a "periodic" game), however, the Nash equilibrium generically moves. How learning dynamics behave in such periodic games is of interest but remains unclear. We find that the behavior depends strongly on the relationship between two speeds: the speed at which the game changes and the speed at which the players learn. When these two speeds synchronize, the learning dynamics diverge and their time-average does not converge. Otherwise, the learning dynamics trace complicated cycles, but their time-average converges. Under some assumptions introduced for the dynamical systems analysis, we prove that this behavior occurs. Furthermore, our experiments show the same behavior even when these assumptions are removed. This study thus identifies a novel phenomenon, synchronization, and provides insight that applies broadly to learning in periodic games.
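The synchronization effect described above can be illustrated with a deliberately simplified toy model. The sketch below is not the model analyzed in the paper: it assumes an unconstrained bilinear zero-sum game whose equilibrium c(t) moves on a small circle, and gradient-style continuous-time dynamics x' = eta (y - b(t)), y' = -eta (x - a(t)). Writing z = x + i y and c = a + i b, this becomes z' = -i eta (z - c(t)). The names `simulate`, `eta` (learning speed), `omega` (game speed), `eps` (radius of the equilibrium's orbit), and `z0` are all hypothetical choices made for this illustration.

```python
import cmath


def simulate(eta, omega, eps=0.1, z0=0.5 + 0.0j, t_end=400.0, dt=0.01):
    """Integrate z' = -i*eta*(z - c(t)) with classical RK4.

    z = x + i*y stacks the two players' strategies, and
    c(t) = eps*exp(-i*omega*t) is the moving equilibrium (toy assumption).
    Returns the distance from the origin and the running time average.
    """
    def rhs(t, z):
        c = eps * cmath.exp(-1j * omega * t)  # equilibrium position at time t
        return -1j * eta * (z - c)

    steps = int(round(t_end / dt))
    z = z0
    running_sum = 0j
    radii, averages = [], []
    for n in range(steps):
        t = n * dt
        # One RK4 step; explicit Euler would spiral outward artificially.
        k1 = rhs(t, z)
        k2 = rhs(t + dt / 2, z + dt / 2 * k1)
        k3 = rhs(t + dt / 2, z + dt / 2 * k2)
        k4 = rhs(t + dt, z + dt * k3)
        z += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        running_sum += z * dt
        radii.append(abs(z))
        averages.append(running_sum / ((n + 1) * dt))
    return radii, averages


# Synchronized: the game changes at the same speed the players learn.
radii_sync, avg_sync = simulate(eta=1.0, omega=1.0)
# Not synchronized: the game changes at half the learning speed.
radii_off, avg_off = simulate(eta=1.0, omega=0.5)

print("final distance, synchronized:    ", radii_sync[-1])
print("final distance, not synchronized:", radii_off[-1])
print("final |time average|, synchronized:    ", abs(avg_sync[-1]))
print("final |time average|, not synchronized:", abs(avg_off[-1]))
```

In this toy, the synchronized case is an ordinary resonance: the trajectory's distance from the origin grows roughly linearly in time, and its time average keeps rotating at a distance of about `eps` instead of settling. Away from synchronization the trajectory stays bounded and its time average shrinks toward the time average of the moving equilibrium, which is zero here. The paper's own setting may differ in ways this sketch does not capture.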