A Parameter-Free First-Order Algorithm for Non-Convex Optimization with $\tilde{O}(\varepsilon^{-5/3})$ Global Rate

📅 2026-05-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes PF-AGD, a deterministic accelerated first-order algorithm for smooth non-convex optimization that requires no prior knowledge of the smoothness constant. By combining adaptive backtracking with a gradient-based restart mechanism, PF-AGD estimates local curvature as it runs, so no problem-dependent parameters need to be set by hand. It attains the best-known first-order oracle complexity of $\tilde{O}(\varepsilon^{-5/3})$ and, according to the authors, is the first parameter-free method to do so. Empirically, PF-AGD outperforms the practical variant of AGD-Until-Guilty and other parameter-free variants, and is a viable alternative to nonlinear conjugate gradient methods, pairing the best-known theoretical rate with practical efficacy.

📝 Abstract

We introduce PF-AGD, the first parameter-free, deterministic, accelerated first-order method to achieve an $O(\varepsilon^{-5/3}\log(1/\varepsilon))$ oracle complexity bound when minimizing sufficiently smooth, non-convex functions; this is the best-known bound for first-order methods on smooth non-convex objectives. Unlike existing methods with this rate, which require a priori knowledge of smoothness constants, we use an adaptive backtracking scheme and a gradient-based restart mechanism to estimate local curvature. This yields a practical algorithm that matches the best-known theoretical rates. Empirically, PF-AGD outperforms the practical variant of AGD-Until-Guilty (Carmon et al., 2017), as well as other parameter-free variants, and is a viable alternative to nonlinear conjugate gradient methods.
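
The abstract names two ingredients, adaptive backtracking and a gradient-based restart, without giving their details. As a rough illustration of those two ideas only, and not the PF-AGD algorithm itself, the Python sketch below combines a standard backtracking estimate of the smoothness constant with Nesterov-style momentum and a commonly used gradient restart test. The function name `backtracking_agd`, the doubling and halving schedule for the estimate `L`, and the double-well test objective are all assumptions made for this example; the sketch carries none of the paper's complexity guarantees.

```python
import numpy as np


def backtracking_agd(f, grad, x0, L0=1.0, tol=1e-6, max_iter=10_000):
    """Generic accelerated descent with a backtracked smoothness estimate
    and a gradient-based momentum restart. Illustrative only; NOT PF-AGD."""
    x = np.asarray(x0, dtype=float)
    y = x.copy()   # extrapolated (momentum) point
    L = L0         # running estimate of the local smoothness constant
    t = 1.0        # momentum counter
    for k in range(max_iter):
        g = grad(y)
        gg = float(g @ g)
        if np.sqrt(gg) <= tol:
            return y, L, k  # approximate stationary point found
        # Backtracking: double L until the sufficient-decrease test
        #   f(y - g/L) <= f(y) - ||g||^2 / (2L)
        # holds, so no smoothness constant has to be supplied up front.
        fy = f(y)
        while f(y - g / L) > fy - gg / (2.0 * L):
            L *= 2.0
        x_new = y - g / L
        if g @ (x_new - x) > 0:
            # Gradient-based restart: the last move points uphill with
            # respect to the current gradient, so discard the momentum.
            t = 1.0
            y = x_new
        else:
            t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
            y = x_new + ((t - 1.0) / t_new) * (x_new - x)
            t = t_new
        x = x_new
        L = max(L / 2.0, 1e-12)  # let the estimate shrink again next round
    return x, L, max_iter


# Hypothetical non-convex test objective: a separable double well.
def double_well(x):
    return float(np.sum(0.25 * x**4 - 0.5 * x**2))


def double_well_grad(x):
    return x**3 - x


x_out, L_est, iters = backtracking_agd(double_well, double_well_grad, [2.0, -1.5])
print(x_out, L_est, iters)
```

The only inputs are the objective, its gradient, and a starting point; the initial guess `L0` is corrected by the backtracking loop, which is the sense in which schemes of this kind avoid a prespecified smoothness constant.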

Problem

Research questions and friction points this paper is trying to address.

non-convex optimization

parameter-free algorithm

first-order method

oracle complexity

smoothness constant

Innovation

Methods, ideas, or system contributions that make the work stand out.

parameter-free

non-convex optimization

accelerated first-order method