🤖 AI Summary
This paper studies nonconvex-concave minimax optimization problems of the form $\min_{\mathbf{x}} \max_{\mathbf{y} \in \mathcal{Y}} f(\mathbf{x}, \mathbf{y})$, where $f$ is nonconvex in $\mathbf{x}$ and concave in $\mathbf{y}$, and $\mathcal{Y}$ is a bounded convex set. Addressing the lack of convergence of single-timescale gradient descent ascent (GDA) on such problems, we establish the first systematic convergence theory for two-timescale GDA (TTGDA) in the nonconvex-concave setting. We prove that TTGDA converges to a first-order stationary point of the objective $\Phi(\mathbf{x}) = \max_{\mathbf{y} \in \mathcal{Y}} f(\mathbf{x}, \mathbf{y})$, and derive its optimal iteration complexity bound. Furthermore, we propose an adaptive stepsize scheme that unifies the treatment of the smooth and nonsmooth cases. Experiments demonstrate that our method significantly improves training stability and convergence speed in practical applications, including generative adversarial networks (GANs).
📝 Abstract
We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems of the form $\min_{\mathbf{x}} \max_{\mathbf{y} \in \mathcal{Y}} f(\mathbf{x}, \mathbf{y})$, where the objective function $f(\mathbf{x}, \mathbf{y})$ is nonconvex in $\mathbf{x}$ and concave in $\mathbf{y}$, and the constraint set $\mathcal{Y} \subseteq \mathbb{R}^n$ is convex and bounded. In the convex-concave setting, the single-timescale gradient descent ascent (GDA) algorithm is widely used in applications and has been shown to have strong convergence guarantees. In more general settings, however, it can fail to converge. Our contribution is to design TTGDA algorithms that are effective beyond the convex-concave setting, efficiently finding a stationary point of the function $\Phi(\cdot) := \max_{\mathbf{y} \in \mathcal{Y}} f(\cdot, \mathbf{y})$. We also establish theoretical bounds on the complexity of solving both smooth and nonsmooth nonconvex-concave minimax optimization problems. To the best of our knowledge, this is the first systematic analysis of TTGDA for nonconvex minimax optimization, shedding light on its superior performance in training generative adversarial networks (GANs) and in other real-world application problems.
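The two-timescale idea above can be illustrated with a minimal sketch: gradient descent on $\mathbf{x}$ with a small stepsize and projected gradient ascent on $\mathbf{y}$ with a much larger stepsize. The toy objective $f(x, y) = y\cos x - y^2/2$ on $\mathcal{Y} = [-1, 1]$ and the stepsize values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ttgda(grad_x, grad_y, project_y, x0, y0, eta_x, eta_y, iters):
    """Two-timescale GDA: slow descent on x, fast projected ascent on y.

    The two timescales are encoded by choosing eta_y >> eta_x, so that y
    approximately tracks the inner maximizer y*(x) while x moves slowly.
    """
    x, y = float(x0), float(y0)
    for _ in range(iters):
        x = x - eta_x * grad_x(x, y)              # descent step on x
        y = project_y(y + eta_y * grad_y(x, y))   # projected ascent step on y
    return x, y

# Toy nonconvex-concave objective (assumed for illustration):
#   f(x, y) = y*cos(x) - y**2/2, nonconvex in x, strongly concave in y,
#   with Y = [-1, 1], so y*(x) = cos(x) and Phi(x) = cos(x)**2 / 2.
grad_x = lambda x, y: -y * np.sin(x)
grad_y = lambda x, y: np.cos(x) - y
project_y = lambda y: np.clip(y, -1.0, 1.0)      # projection onto Y = [-1, 1]

x, y = ttgda(grad_x, grad_y, project_y, x0=1.0, y0=0.0,
             eta_x=0.05, eta_y=0.5, iters=2000)
# x approaches pi/2, a stationary point of Phi, and y tracks cos(x).
```

With these stepsizes the inner variable equilibrates quickly, so the $x$-iterates behave approximately like gradient descent on $\Phi$, which is the mechanism the paper's analysis makes rigorous.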