🤖 AI Summary
This work proposes PF-AGD, a deterministic accelerated first-order algorithm for smooth nonconvex optimization that requires no prior knowledge of the smoothness constants. By combining adaptive backtracking with a gradient-based restart mechanism, PF-AGD estimates local curvature on the fly, so no smoothness constant has to be supplied or tuned in advance. It attains the best-known first-order oracle complexity of $\tilde{O}(\varepsilon^{-5/3})$, that is, $O(\varepsilon^{-5/3}\log(1/\varepsilon))$, and is the first parameter-free method to do so. Empirically, PF-AGD outperforms the practical variant of AGD-Until-Guilty and other parameter-free variants, and is a viable alternative to nonlinear conjugate gradient methods, so it pairs the best-known theoretical rate with practical performance.
📝 Abstract
We introduce PF-AGD, the first parameter-free, deterministic, accelerated first-order method to achieve an $O(\varepsilon^{-5/3}\log(1/\varepsilon))$ oracle complexity bound when minimizing sufficiently smooth, non-convex functions; this is the best-known bound for first-order methods on smooth non-convex objectives. Unlike existing methods with this rate, which require a priori knowledge of smoothness constants, we use an adaptive backtracking scheme and a gradient-based restart mechanism to estimate local curvature. This yields a practical algorithm that matches the best-known theoretical rates. Empirically, PF-AGD outperforms the practical variant of AGD-Until-Guilty (Carmon et al., 2017), as well as other parameter-free variants, and is a viable alternative to nonlinear conjugate gradient methods.
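To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a generic accelerated gradient loop with backtracking and restarts. It is not the PF-AGD algorithm, whose details the abstract does not give, and it carries none of the paper's complexity guarantees. The function name `agd_backtracking_restart`, the doubling and halving rule for the estimate `L`, the cap of 60 backtracking trials, and the extra function-value safeguard are all assumptions made for illustration.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def agd_backtracking_restart(f, grad, x0, L0=1.0, tol=1e-6, max_iter=10_000):
    """Generic illustration only; NOT the paper's PF-AGD (all rules assumed)."""
    x = list(x0)
    y = list(x0)
    fy = f(y)
    L = L0      # running estimate of the local smoothness constant
    t = 1.0     # momentum counter
    for k in range(max_iter):
        g = grad(y)
        g2 = dot(g, g)
        if math.sqrt(g2) <= tol:
            return y, L, k
        # Backtracking: double L until the step y - g/L gives sufficient decrease.
        for _ in range(60):
            x_new = [yi - gi / L for yi, gi in zip(y, g)]
            f_new = f(x_new)
            if f_new <= fy - g2 / (2.0 * L):
                break
            L *= 2.0
        # Trial extrapolation with Nesterov-style momentum.
        step = [a - b for a, b in zip(x_new, x)]
        t_new = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
        beta = (t - 1.0) / t_new
        y_trial = [xn + beta * s for xn, s in zip(x_new, step)]
        f_trial = f(y_trial)
        # Restart (drop momentum) if the gradient opposes the last move, or if
        # extrapolating raised f (a safeguard assumed here, not from the paper).
        if dot(g, step) > 0.0 or f_trial > f_new:
            y, fy, t = list(x_new), f_new, 1.0
        else:
            y, fy, t = y_trial, f_trial, t_new
        x = x_new
        L = max(L / 2.0, 1e-12)  # let the estimate shrink again next iteration
    return x, L, max_iter


if __name__ == "__main__":
    f = lambda v: 0.5 * (v[0] ** 2 + 10.0 * v[1] ** 2)
    grad = lambda v: [v[0], 10.0 * v[1]]
    print(agd_backtracking_restart(f, grad, [3.0, -2.0]))
```

The caller supplies only `f`, `grad`, and a starting point. The initial guess `L0` is corrected by the backtracking loop, which is the sense in which such schemes avoid a prespecified smoothness constant. Halving `L` after each accepted step lets the estimate follow local rather than worst-case curvature, at the cost of extra function evaluations.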