🤖 AI Summary
First-order optimization methods such as gradient descent rely heavily on manual learning rate tuning, which is especially impractical in nested optimization settings. This paper proposes AutoGD, a fully automated, prior-free gradient descent algorithm with adaptive step sizes. Its core mechanism dynamically adjusts the step length based on gradient variation across iterations, and is theoretically proven to recover the optimal convergence rates of standard gradient descent for both smooth convex and nonconvex functions. The authors further extend this adaptive principle to the quasi-Newton framework, yielding AutoBFGS and AutoLBFGS. Empirical evaluations demonstrate that AutoGD and its variants consistently outperform fixed-step methods and mainstream adaptive optimizers, including Adam and AdaGrad, across classical optimization benchmarks and variational inference tasks. The proposed methods combine rigorous theoretical guarantees with broad practical applicability.
📝 Abstract
The performance of gradient-based optimization methods, such as standard gradient descent (GD), greatly depends on the choice of learning rate. However, it can require a non-trivial amount of user tuning effort to select an appropriate learning rate schedule. When such methods appear as inner loops of other algorithms, expecting the user to tune the learning rates may be impractical. To address this, we introduce AutoGD: a gradient descent method that automatically determines whether to increase or decrease the learning rate at a given iteration. We establish the convergence of AutoGD, and show that we can recover the optimal rate of GD (up to a constant) for a broad class of functions without knowledge of smoothness constants. Experiments on a variety of traditional problems and variational inference optimization tasks demonstrate strong performance of the method, along with its extensions to AutoBFGS and AutoLBFGS.
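The idea of automatically deciding whether to grow or shrink the learning rate at each iteration can be sketched as follows. This is a minimal illustration, not the paper's actual AutoGD rule: the function name, the multiplicative constants, and the accept/reject test on the objective value are all assumptions made for exposition.

```python
import numpy as np

def auto_gd(f, grad, x0, lr=1.0, up=2.0, down=0.5, max_iter=200, tol=1e-8):
    """Gradient descent with a multiplicatively adapted step size.

    Illustrative sketch only: AutoGD's actual adjustment rule is based on
    gradient variation across iterations; here a simple heuristic grows the
    step after an accepted move and shrinks it after a rejected one.
    """
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:  # stationary point reached
            break
        x_new = x - lr * g
        f_new = f(x_new)
        if f_new < fx:   # accepted: take the step, try a larger one next
            x, fx = x_new, f_new
            lr *= up
        else:            # rejected: keep the iterate, shrink the step
            lr *= down
    return x, fx

# Usage: minimize the quadratic f(x) = ||x - 1||^2 from a distant start.
f = lambda x: float(np.sum((x - 1.0) ** 2))
grad = lambda x: 2.0 * (x - 1.0)
x_star, f_star = auto_gd(f, grad, np.array([5.0, -3.0]))
```

No smoothness constant is supplied by the user: an overly large step is simply rejected and halved, which mirrors the prior-free spirit of the method at the cost of extra function evaluations.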