Second-Order Min-Max Optimization with Lazy Hessians

📅 2024-10-12
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses the prohibitively high computational cost of second-order methods in convex–concave minimax optimization. We propose a "lazy Hessian" strategy that maintains the optimal iteration complexity $\mathcal{O}(\varepsilon^{-3/2})$ while drastically reducing per-iteration cost. By reusing Hessian information across iterations and incorporating an adaptive update mechanism, we achieve the first total computational complexity of $\tilde{\mathcal{O}}\big((N + d^2)(d + d^{2/3}\varepsilon^{-2/3})\big)$, improving upon the Monteiro–Svaiter (2012) optimal method by a factor of $d^{1/3}$. We further extend the framework to the strongly convex–strongly concave setting. The theoretical analysis is rigorous, and extensive numerical experiments on both synthetic and real-world datasets confirm accelerated convergence and reduced resource consumption.
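To make the reuse idea concrete, here is a minimal, illustrative sketch of a damped Newton-type iteration on the saddle-point operator that refreshes second-order information only every $m$ steps. The function names, the fixed reuse period `m`, the damping `lam`, and the finite-difference Jacobian are all assumptions for illustration; the paper's actual method uses an adaptive update mechanism, not this fixed schedule.

```python
import numpy as np

def saddle_operator(grad_x, grad_y, x, y):
    # F(z) = [grad_x f(x, y); -grad_y f(x, y)]; roots of F are saddle points.
    return np.concatenate([grad_x(x, y), -grad_y(x, y)])

def num_jacobian(F, z, eps=1e-6):
    # Finite-difference Jacobian of F at z: the expensive second-order
    # computation that lazy updates amortize across iterations.
    n = z.size
    J = np.zeros((n, n))
    f0 = F(z)
    for i in range(n):
        zp = z.copy()
        zp[i] += eps
        J[:, i] = (F(zp) - f0) / eps
    return J

def lazy_newton_minimax(grad_x, grad_y, z0, d, m=10, lam=1.0, iters=100):
    # Damped Newton-type iteration that recomputes the Jacobian of F only
    # once every m steps and reuses the stale copy in between ("lazy Hessian").
    z = np.asarray(z0, dtype=float).copy()
    F = lambda z: saddle_operator(grad_x, grad_y, z[:d], z[d:])
    J = None
    for t in range(iters):
        if t % m == 0:
            J = num_jacobian(F, z)  # refreshed only every m iterations
        z = z + np.linalg.solve(J + lam * np.eye(z.size), -F(z))
    return z[:d], z[d:]

# Toy 1-D problem f(x, y) = 0.5*x**2 + x*y - 0.5*y**2, saddle point at (0, 0).
gx = lambda x, y: x + y  # grad_x f
gy = lambda x, y: x - y  # grad_y f
x_star, y_star = lazy_newton_minimax(gx, gy, np.array([3.0, -2.0]), d=1, m=5)
print(x_star, y_star)    # both approach 0
```

The point of the schedule is the cost split: the Jacobian costs roughly a factor $d$ more than an operator evaluation, so recomputing it every $m$ steps amortizes that factor across the cheap in-between iterations.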

📝 Abstract
This paper studies second-order methods for convex-concave minimax optimization. Monteiro and Svaiter (2012) proposed a method that solves the problem with an optimal iteration complexity of $\mathcal{O}(\epsilon^{-3/2})$ to find an $\epsilon$-saddle point. However, it is unclear whether the computational complexity, $\mathcal{O}((N+d^2)\, d\, \epsilon^{-2/3})$, can be improved. Following Doikov et al. (2023), we assume the cost of obtaining a first-order oracle is $N$ and the cost of obtaining a second-order oracle is $dN$. In this paper, we show that the computational cost can be reduced by reusing the Hessian across iterations. Our methods achieve an overall computational complexity of $\tilde{\mathcal{O}}\big((N+d^2)(d+ d^{2/3}\epsilon^{-2/3})\big)$, which improves on previous methods by a factor of $d^{1/3}$. Furthermore, we generalize our method to strongly-convex-strongly-concave minimax problems and establish a complexity of $\tilde{\mathcal{O}}\big((N+d^2)(d + d^{2/3} \kappa^{2/3})\big)$ when the condition number of the problem is $\kappa$, enjoying a similar speedup over the state-of-the-art method. Numerical experiments on both real and synthetic datasets also verify the efficiency of our method.
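To see where the $d^{1/3}$ factor comes from, compare the $\epsilon$-dependent terms of the two total-cost bounds (a back-of-the-envelope reading of the stated complexities, in the regime where $d^{2/3}\epsilon^{-2/3}$ dominates the additive $d$):
$$
\frac{(N+d^2)\, d\, \epsilon^{-2/3}}{(N+d^2)\, d^{2/3}\, \epsilon^{-2/3}} = \frac{d}{d^{2/3}} = d^{1/3}.
$$
The strongly-convex-strongly-concave bound replaces $\epsilon^{-2/3}$ with $\kappa^{2/3}$, so the same cancellation yields the same $d^{1/3}$ speedup over the state of the art.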
Problem

Research questions and friction points this paper is trying to address.

Improving the computational complexity of second-order minimax optimization
Reducing cost by reusing the Hessian across iterations
Generalizing the method to strongly-convex-strongly-concave problems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reusing the Hessian across iterations reduces per-iteration cost
Computational complexity improved by a factor of $d^{1/3}$
Method generalized to strongly-convex-strongly-concave problems
👥 Authors
Lesi Chen (PhD student, IIIS, Tsinghua University; Optimization Theory)
Chengchang Liu (The Chinese University of Hong Kong)
Jingzhao Zhang (IIIS, Tsinghua University; Shanghai AI Lab; Shanghai Qizhi Institute)