Second-Order Min-Max Optimization with Lazy Hessians

📅 2024-10-12
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses the prohibitively high computational cost of second-order methods in convex–concave minimax optimization. We propose a "lazy Hessian" strategy that maintains the optimal iteration complexity $\mathcal{O}(\varepsilon^{-3/2})$ while drastically reducing per-iteration cost. By reusing Hessian information across iterations and incorporating an adaptive update mechanism, we achieve the first total computational complexity of $\tilde{\mathcal{O}}\big((N + d^2)(d + d^{2/3}\varepsilon^{-2/3})\big)$, improving upon the Monteiro–Svaiter (2012) optimal method by a factor of $d^{1/3}$. We further extend the framework to the strongly convex–strongly concave setting. The theoretical analysis is rigorous, and extensive numerical experiments on both synthetic and real-world datasets confirm accelerated convergence and reduced resource consumption.
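To make the reuse idea concrete, here is a minimal, illustrative sketch of a damped Newton-type iteration on the saddle-point operator that refreshes second-order information only every $m$ steps. The function names, the fixed reuse period `m`, the damping `lam`, and the finite-difference Jacobian are all assumptions for illustration; the paper's actual method uses an adaptive update mechanism, not this fixed schedule.

```python
import numpy as np

def saddle_operator(grad_x, grad_y, x, y):
    # F(z) = [grad_x f(x, y); -grad_y f(x, y)]; roots of F are saddle points.
    return np.concatenate([grad_x(x, y), -grad_y(x, y)])

def num_jacobian(F, z, eps=1e-6):
    # Finite-difference Jacobian of F at z: the expensive second-order
    # computation that lazy updates amortize across iterations.
    n = z.size
    J = np.zeros((n, n))
    f0 = F(z)
    for i in range(n):
        zp = z.copy()
        zp[i] += eps
        J[:, i] = (F(zp) - f0) / eps
    return J

def lazy_newton_minimax(grad_x, grad_y, z0, d, m=10, lam=1.0, iters=100):
    # Damped Newton-type iteration that recomputes the Jacobian of F only
    # once every m steps and reuses the stale copy in between ("lazy Hessian").
    z = np.asarray(z0, dtype=float).copy()
    F = lambda z: saddle_operator(grad_x, grad_y, z[:d], z[d:])
    J = None
    for t in range(iters):
        if t % m == 0:
            J = num_jacobian(F, z)  # refreshed only every m iterations
        z = z + np.linalg.solve(J + lam * np.eye(z.size), -F(z))
    return z[:d], z[d:]

# Toy 1-D problem f(x, y) = 0.5*x**2 + x*y - 0.5*y**2, saddle point at (0, 0).
gx = lambda x, y: x + y  # grad_x f
gy = lambda x, y: x - y  # grad_y f
x_star, y_star = lazy_newton_minimax(gx, gy, np.array([3.0, -2.0]), d=1, m=5)
print(x_star, y_star)    # both approach 0
```

The point of the schedule is the cost split: the Jacobian costs roughly a factor $d$ more than an operator evaluation, so recomputing it every $m$ steps amortizes that factor across the cheap in-between iterations.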

📝 Abstract
This paper studies second-order methods for convex-concave minimax optimization. Monteiro and Svaiter (2012) proposed a method that solves the problem with an optimal iteration complexity of $\mathcal{O}(\epsilon^{-3/2})$ to find an $\epsilon$-saddle point. However, it is unclear whether the computational complexity, $\mathcal{O}((N+d^2)\, d\, \epsilon^{-2/3})$, can be improved. Following Doikov et al. (2023), we assume the cost of obtaining a first-order oracle is $N$ and the cost of obtaining a second-order oracle is $dN$. In this paper, we show that the computational cost can be reduced by reusing the Hessian across iterations. Our methods achieve an overall computational complexity of $\tilde{\mathcal{O}}\big((N+d^2)(d+ d^{2/3}\epsilon^{-2/3})\big)$, which improves on previous methods by a factor of $d^{1/3}$. Furthermore, we generalize our method to strongly-convex-strongly-concave minimax problems and establish a complexity of $\tilde{\mathcal{O}}\big((N+d^2)(d + d^{2/3} \kappa^{2/3})\big)$ when the condition number of the problem is $\kappa$, enjoying a similar speedup over the state-of-the-art method. Numerical experiments on both real and synthetic datasets also verify the efficiency of our method.
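To see where the $d^{1/3}$ factor comes from, compare the $\epsilon$-dependent terms of the two total-cost bounds (a back-of-the-envelope reading of the stated complexities, in the regime where $d^{2/3}\epsilon^{-2/3}$ dominates the additive $d$):
$$
\frac{(N+d^2)\, d\, \epsilon^{-2/3}}{(N+d^2)\, d^{2/3}\, \epsilon^{-2/3}} = \frac{d}{d^{2/3}} = d^{1/3}.
$$
The strongly-convex-strongly-concave bound replaces $\epsilon^{-2/3}$ with $\kappa^{2/3}$, so the same cancellation yields the same $d^{1/3}$ speedup over the state of the art.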
Problem

Research questions and friction points this paper is trying to address.

Improving the computational complexity of second-order minimax optimization
Reducing cost by reusing the Hessian across iterations
Generalizing the method to strongly-convex-strongly-concave problems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reusing the Hessian across iterations reduces per-iteration cost
Computational complexity improved by a factor of $d^{1/3}$
Method generalized to strongly-convex-strongly-concave problems
👥 Authors
Lesi Chen (PhD student, IIIS, Tsinghua University; Optimization Theory)
Chengchang Liu (The Chinese University of Hong Kong)
Jingzhao Zhang (IIIS, Tsinghua University; Shanghai AI Lab; Shanghai Qizhi Institute)