Adaptive Accelerated Proximal Gradient Methods with Variance Reduction for Composite Nonconvex Finite-Sum Minimization

📅 2025-02-28

📈 Citations: 0

✨ Influential: 0

career value

209K/year

🤖 AI Summary

We address composite nonconvex finite-sum minimization problems. We propose AAPG, a parameter-free adaptive accelerated proximal gradient method, and its stochastic variance-reduced variant AAPG-SPIDER. Both methods integrate three acceleration mechanisms: adaptive step sizes, Nesterov extrapolation, and SPIDER-type variance reduction. Under the Kurdyka–Łojasiewicz (KL) condition, we establish a non-ergodic convergence rate analysis framework. Theoretically, AAPG achieves the optimal iteration complexity of $O(Nvarepsilon^{-2})$, while AAPG-SPIDER attains $O(N + sqrt{N}varepsilon^{-2})$, marking the first attainment of optimal complexity for this class of problems. Empirical evaluations on sparse phase retrieval and linear eigenvalue problems demonstrate that our methods significantly outperform existing state-of-the-art algorithms in both convergence speed and solution quality.

Technology Category

Application Category

📝 Abstract

This paper proposes {sf AAPG-SPIDER}, an Adaptive Accelerated Proximal Gradient (AAPG) method with variance reduction for minimizing composite nonconvex finite-sum functions. It integrates three acceleration techniques: adaptive stepsizes, Nesterov's extrapolation, and the recursive stochastic path-integrated estimator SPIDER. While targeting stochastic finite-sum problems, {sf AAPG-SPIDER} simplifies to {sf AAPG} in the full-batch, non-stochastic setting, which is also of independent interest. To our knowledge, {sf AAPG-SPIDER} and {sf AAPG} are the first learning-rate-free methods to achieve optimal iteration complexity for this class of extit{composite} minimization problems. Specifically, {sf AAPG} achieves the optimal iteration complexity of $mathcal{O}(N epsilon^{-2})$, while {sf AAPG-SPIDER} achieves $mathcal{O}(N + sqrt{N} epsilon^{-2})$ for finding $epsilon$-approximate stationary points, where $N$ is the number of component functions. Under the Kurdyka-Lojasiewicz (KL) assumption, we establish non-ergodic convergence rates for both methods. Preliminary experiments on sparse phase retrieval and linear eigenvalue problems demonstrate the superior performance of {sf AAPG-SPIDER} and {sf AAPG} compared to existing methods.

Problem

Research questions and friction points this paper is trying to address.

Minimizes composite nonconvex finite-sum functions efficiently.

Achieves optimal iteration complexity for composite minimization problems.

Demonstrates superior performance in sparse phase retrieval and eigenvalue problems.

Innovation

Methods, ideas, or system contributions that make the work stand out.

AAPG-SPIDER integrates adaptive stepsizes and Nesterov's extrapolation.

AAPG-SPIDER uses SPIDER for variance reduction in optimization.

AAPG-SPIDER achieves optimal iteration complexity for nonconvex problems.

🔎 Similar Papers

Variance Reduction and Low Sample Complexity in Stochastic Optimization via Proximal Point Method