AutoGD: Automatic Learning Rate Selection for Gradient Descent

📅 2025-10-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
First-order optimization methods such as gradient descent rely heavily on manual learning rate tuning, which becomes especially impractical in nested optimization settings. This paper proposes AutoGD, a fully automated gradient descent algorithm with adaptive step sizes that requires no prior tuning. Its core mechanism decides at each iteration whether to increase or decrease the step size based on the observed variation of the gradients, and it is proven to recover the optimal convergence rates of standard gradient descent for broad classes of smooth convex and nonconvex functions. The authors further extend this adaptive principle to the quasi-Newton framework, yielding AutoBFGS and AutoLBFGS. Empirical evaluations show that AutoGD and its variants consistently outperform fixed-step methods and mainstream adaptive optimizers, including Adam and AdaGrad, across classical optimization benchmarks and variational inference tasks. The proposed methods combine rigorous theoretical guarantees with broad practical applicability.

📝 Abstract
The performance of gradient-based optimization methods, such as standard gradient descent (GD), greatly depends on the choice of learning rate. However, it can require a non-trivial amount of user tuning effort to select an appropriate learning rate schedule. When such methods appear as inner loops of other algorithms, expecting the user to tune the learning rates may be impractical. To address this, we introduce AutoGD: a gradient descent method that automatically determines whether to increase or decrease the learning rate at a given iteration. We establish the convergence of AutoGD, and show that we can recover the optimal rate of GD (up to a constant) for a broad class of functions without knowledge of smoothness constants. Experiments on a variety of traditional problems and variational inference optimization tasks demonstrate strong performance of the method, along with its extensions to AutoBFGS and AutoLBFGS.
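The abstract describes a method that decides per iteration whether to increase or decrease the learning rate, without knowing smoothness constants. As a rough illustration only, the sketch below uses a hypothetical doubling/halving controller driven by objective decrease; the actual AutoGD acceptance rule (based on gradient variation) differs, and the function names and parameters here are assumptions, not the paper's API:

```python
import numpy as np

def adaptive_gd(f, grad, x0, lr=1.0, iters=200, up=2.0, down=0.5):
    """Gradient descent with a heuristic doubling/halving step size.

    Hypothetical sketch: a trial step is accepted (and the step size
    grown) when it decreases the objective; otherwise the step size
    shrinks. This is NOT the exact AutoGD rule from the paper; it only
    illustrates automatic per-iteration step-size control.
    """
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(iters):
        trial = x - lr * grad(x)
        f_trial = f(trial)
        if f_trial < fx:        # progress: accept and try a larger step
            x, fx = trial, f_trial
            lr *= up
        else:                   # overshoot: shrink the step and retry
            lr *= down
    return x

# Minimize the quadratic f(x) = 1.5 * ||x||^2 (gradient 3x);
# note that no smoothness constant is supplied to the optimizer.
f = lambda x: 1.5 * float(np.dot(x, x))
grad = lambda x: 3.0 * x
sol = adaptive_gd(f, grad, [3.0, -2.0])
```

The doubling/halving design lets the step size recover after overly cautious shrinkage, which is the kind of automatic forward/backward adjustment the abstract alludes to.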
Problem

Research questions and friction points this paper is trying to address.

Automating learning rate selection for gradient descent
Eliminating manual tuning of optimization parameters
Adapting learning rates without smoothness constant knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automatically adjusts learning rate per iteration
Recovers optimal GD rate without smoothness knowledge
Extends to quasi-Newton methods like BFGS
Nikola Surjanovic
University of British Columbia
Alexandre Bouchard-Côté
University of British Columbia
Trevor Campbell
Associate Professor, Statistics, UBC
Machine Learning, Statistics, Optimization, Mathematics