🤖 AI Summary
This work addresses the high computational cost of second-order methods in convex–concave minimax optimization. We propose a “lazy Hessian” strategy that lowers the cost relative to the Monteiro–Svaiter (2012) method, whose iteration complexity of $\mathcal{O}(\varepsilon^{-2/3})$ is optimal. By reusing Hessian information across iterations and incorporating an adaptive update mechanism, the method attains a total computational complexity of $\tilde{\mathcal{O}}\big((N + d^2)(d + d^{2/3}\varepsilon^{-2/3})\big)$, improving upon the Monteiro–Svaiter (2012) method by a factor of $d^{1/3}$. We further extend the framework to the strongly-convex–strongly-concave setting. Numerical experiments on both synthetic and real-world datasets support the theoretical results and confirm the efficiency gains.
📝 Abstract
This paper studies second-order methods for convex-concave minimax optimization. Monteiro and Svaiter (2012) proposed a method that finds an $\epsilon$-saddle point with an optimal iteration complexity of $\mathcal{O}(\epsilon^{-2/3})$. However, it is unclear whether its computational complexity, $\mathcal{O}((N + d^2) d \epsilon^{-2/3})$, can be improved. Here we follow Doikov et al. (2023) and assume that a call to the first-order oracle costs $N$ and a call to the second-order oracle costs $dN$. In this paper, we show that the computational cost can be reduced by reusing the Hessian across iterations. Our methods achieve an overall computational complexity of $\tilde{\mathcal{O}}((N + d^2)(d + d^{2/3} \epsilon^{-2/3}))$, which improves on that of previous methods by a factor of $d^{1/3}$. Furthermore, we generalize our method to strongly-convex-strongly-concave minimax problems and establish a complexity of $\tilde{\mathcal{O}}((N + d^2)(d + d^{2/3} \kappa^{2/3}))$, where $\kappa$ is the condition number of the problem, a similar speedup over the state-of-the-art method. Numerical experiments on both real and synthetic datasets also verify the efficiency of our method.
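
To make the $d^{1/3}$ factor concrete, the two complexity bounds quoted above can be compared numerically. The following is a minimal sketch rather than anything from the paper: the function names are hypothetical, and constants and the logarithmic factors hidden by $\tilde{\mathcal{O}}$ are dropped, so only the ratio of the two expressions is meaningful.

```python
def full_hessian_cost(N, d, eps):
    """Bound when the Hessian is recomputed every iteration:
    (N + d^2) * d * eps^(-2/3). Constants are dropped (assumption)."""
    return (N + d**2) * d * eps ** (-2.0 / 3.0)


def lazy_hessian_cost(N, d, eps):
    """Bound when the Hessian is reused across iterations:
    (N + d^2) * (d + d^(2/3) * eps^(-2/3)). Constants and log factors
    are dropped (assumption)."""
    return (N + d**2) * (d + d ** (2.0 / 3.0) * eps ** (-2.0 / 3.0))


def speedup(N, d, eps):
    """Ratio of the two bounds; the common factor (N + d^2) cancels."""
    return full_hessian_cost(N, d, eps) / lazy_hessian_cost(N, d, eps)


if __name__ == "__main__":
    d = 1000
    for eps in (1e-1, 1e-3, 1e-6):
        print(f"eps={eps:g}  speedup={speedup(10**4, d, eps):.2f}  "
              f"d^(1/3)={d ** (1.0 / 3.0):.2f}")
```

Under these assumptions the ratio equals $d\,\epsilon^{-2/3} / (d + d^{2/3}\epsilon^{-2/3})$, which stays below $d^{1/3}$ and approaches it as $\epsilon$ shrinks. The full $d^{1/3}$ gain therefore shows up in the high-accuracy regime $\epsilon^{-2/3} \gg d^{1/3}$, where the additive $d$ term is negligible.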