🤖 AI Summary
We address composite nonconvex finite-sum minimization problems. We propose AAPG, a parameter-free adaptive accelerated proximal gradient method, and its stochastic variance-reduced variant AAPG-SPIDER. Both methods integrate three acceleration mechanisms: adaptive step sizes, Nesterov extrapolation, and SPIDER-type variance reduction. Under the Kurdyka–Łojasiewicz (KL) condition, we establish a non-ergodic convergence rate analysis framework. Theoretically, AAPG achieves the optimal iteration complexity of $O(Nvarepsilon^{-2})$, while AAPG-SPIDER attains $O(N + sqrt{N}varepsilon^{-2})$, marking the first attainment of optimal complexity for this class of problems. Empirical evaluations on sparse phase retrieval and linear eigenvalue problems demonstrate that our methods significantly outperform existing state-of-the-art algorithms in both convergence speed and solution quality.
📝 Abstract
This paper proposes {sf AAPG-SPIDER}, an Adaptive Accelerated Proximal Gradient (AAPG) method with variance reduction for minimizing composite nonconvex finite-sum functions. It integrates three acceleration techniques: adaptive stepsizes, Nesterov's extrapolation, and the recursive stochastic path-integrated estimator SPIDER. While targeting stochastic finite-sum problems, {sf AAPG-SPIDER} simplifies to {sf AAPG} in the full-batch, non-stochastic setting, which is also of independent interest. To our knowledge, {sf AAPG-SPIDER} and {sf AAPG} are the first learning-rate-free methods to achieve optimal iteration complexity for this class of extit{composite} minimization problems. Specifically, {sf AAPG} achieves the optimal iteration complexity of $mathcal{O}(N epsilon^{-2})$, while {sf AAPG-SPIDER} achieves $mathcal{O}(N + sqrt{N} epsilon^{-2})$ for finding $epsilon$-approximate stationary points, where $N$ is the number of component functions. Under the Kurdyka-Lojasiewicz (KL) assumption, we establish non-ergodic convergence rates for both methods. Preliminary experiments on sparse phase retrieval and linear eigenvalue problems demonstrate the superior performance of {sf AAPG-SPIDER} and {sf AAPG} compared to existing methods.