🤖 AI Summary
This work addresses the lack of rigorous theoretical analysis of the Adam-DA algorithm’s dynamics in zero-sum games. By deriving ordinary differential equations (ODEs) as its continuous-time limit, the study establishes an analytically tractable dynamical systems framework for investigating local convergence and implicit gradient regularization. The analysis shows that the roles of the first- and second-order momentum parameters are reversed relative to their behavior in standard minimization settings, which challenges the conventional understanding of adaptive optimization. This counterintuitive prediction is validated empirically in GAN experiments across multiple architectures and datasets, confirming the practical significance of the reversed momentum effect and advancing the understanding of Adam-DA’s optimization dynamics in adversarial learning.
📝 Abstract
The remarkable success of Adam in training neural networks has naturally led to the widespread use of its descent-ascent counterpart, Adam-DA, for solving zero-sum games. Despite its popularity in practice, Adam-DA still lacks a rigorous theoretical understanding. In this paper, we derive ordinary differential equations (ODEs) that serve as continuous-time limits of Adam-DA. These ODEs closely approximate the discrete-time dynamics of Adam-DA, providing a tractable analytical framework for understanding its behavior in zero-sum games. Using this ODE approach, we investigate two fundamental aspects of Adam-DA: local convergence and implicit gradient regularization. Our analysis reveals that the roles of the first- and second-order momentum parameters in zero-sum games are exactly the opposite of their well-documented effects in minimization problems. We validate these predictions through GAN experiments across multiple architectures and datasets, demonstrating the practical implications of this reversed momentum effect.
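For readers unfamiliar with the descent-ascent variant, the following is a minimal sketch of what an Adam-DA iteration might look like for a scalar game min_x max_y f(x, y). It is an illustration under stated assumptions, not the paper's formulation: it assumes simultaneous updates of both players, the standard Adam bias correction, and shared hyperparameters. The function name `adam_da` and all parameter names are hypothetical.

```python
import math


def adam_da(grad_x, grad_y, x0, y0, lr=1e-3, beta1=0.9, beta2=0.999,
            eps=1e-8, steps=1000):
    """Illustrative Adam descent-ascent for min over x, max over y of f(x, y).

    grad_x and grad_y are callables returning df/dx and df/dy at (x, y).
    Assumption: both players update simultaneously from the same iterate.
    """
    x, y = float(x0), float(y0)
    mx = vx = my = vy = 0.0  # first and second moment estimates per player
    for t in range(1, steps + 1):
        # Evaluate both gradients at the current point before any update.
        gx = grad_x(x, y)
        gy = -grad_y(x, y)  # ascent on f in y is descent on -f

        # Exponential moving averages of the gradient and its square.
        mx = beta1 * mx + (1 - beta1) * gx
        vx = beta2 * vx + (1 - beta2) * gx * gx
        my = beta1 * my + (1 - beta1) * gy
        vy = beta2 * vy + (1 - beta2) * gy * gy

        # Standard Adam bias correction.
        c1 = 1 - beta1 ** t
        c2 = 1 - beta2 ** t
        x -= lr * (mx / c1) / (math.sqrt(vx / c2) + eps)
        y -= lr * (my / c1) / (math.sqrt(vy / c2) + eps)
    return x, y


if __name__ == "__main__":
    # Bilinear toy game f(x, y) = x * y, chosen only for illustration.
    x, y = adam_da(lambda x, y: y, lambda x, y: x, 1.0, 1.0, lr=0.01, steps=200)
    print(x, y)
```

In this sketch `beta1` and `beta2` are the first- and second-order momentum parameters whose roles the abstract describes as reversed in zero-sum games; varying them on a toy game is one informal way to explore that claim, though it is no substitute for the ODE analysis.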