Stochastic Adaptive Gradient Descent Without Descent

📅 2025-09-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the reliance of stochastic gradient descent (SGD) on manually tuned step sizes in convex stochastic optimization, this paper proposes a fully hyperparameter-free adaptive step-size strategy. The method accesses only first-order stochastic gradient information via an oracle and dynamically captures the local geometric structure of the objective function, marking the first successful extension of “descent-free” adaptive gradient ideas to the stochastic optimization setting. Theoretically, under standard convexity and gradient-variance assumptions, the authors establish a rigorous $O(1/\sqrt{T})$ convergence rate. Empirically, the method matches the performance of carefully tuned baseline algorithms across diverse tasks, including logistic regression, neural network training, and robust optimization, while requiring no hyperparameter configuration. This eliminates manual tuning overhead and significantly enhances algorithmic robustness, reproducibility, and practical applicability.
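For intuition, the deterministic Adaptive Gradient Descent Without Descent rule of Malitsky and Mishchenko estimates a local step size from successive iterates and gradients. The sketch below applies that rule naively to stochastic gradients; the function names (`sgd_adgd`, `grad_oracle`), the initialization, and the `eps` safeguard are illustrative assumptions, not the paper's actual stochastic variant.

```python
import numpy as np

def sgd_adgd(grad_oracle, x0, n_steps, lam0=1e-6, eps=1e-12):
    """SGD with an AdGD-style adaptive step size (illustrative sketch only)."""
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad_oracle(x_prev)        # first stochastic gradient
    lam_prev, theta = lam0, np.inf      # tiny initial step; first growth cap inactive
    x = x_prev - lam_prev * g_prev
    for _ in range(n_steps):
        g = grad_oracle(x)              # fresh stochastic gradient at the current point
        # Inverse local-curvature estimate from successive iterates and gradients.
        local = np.linalg.norm(x - x_prev) / (2.0 * (np.linalg.norm(g - g_prev) + eps))
        # Cap how fast the step size may grow relative to the previous one.
        lam = min(np.sqrt(1.0 + theta) * lam_prev, local)
        x_prev, g_prev = x, g
        x = x - lam * g                 # plain SGD step with the adaptive step size
        theta, lam_prev = lam / lam_prev, lam
    return x
```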

📝 Abstract
We introduce a new adaptive step-size strategy for convex optimization with stochastic gradients that exploits the local geometry of the objective function only by means of a first-order stochastic oracle and without any hyper-parameter tuning. The method comes from a theoretically grounded adaptation of the Adaptive Gradient Descent Without Descent method to the stochastic setting. We prove the convergence of stochastic gradient descent with our step-size under various assumptions, and we show that it empirically competes against tuned baselines.
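For orientation, the convergence guarantee described above typically takes the following form for convex objectives with bounded gradient variance; the precise constants and assumptions under which the paper establishes it may differ:

$$\mathbb{E}\bigl[f(\bar{x}_T) - f(x^\star)\bigr] \;\le\; \frac{C}{\sqrt{T}}, \qquad \bar{x}_T = \frac{1}{T}\sum_{t=1}^{T} x_t,$$

where $x^\star$ is a minimizer of $f$ and $C$ depends on the initial distance $\|x_0 - x^\star\|$ and on the variance of the stochastic gradients.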
Problem

Research questions and friction points this paper is trying to address.

Introduces adaptive step-size for stochastic convex optimization
Uses first-order oracle without hyper-parameter tuning
Proves convergence and competes against tuned baselines
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stochastic adaptive step-size strategy
No hyper-parameter tuning required
First-order stochastic oracle utilization (see the usage sketch below)
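Assuming the `sgd_adgd` sketch shown after the summary above, a hypothetical run on a minibatch least-squares problem could look as follows; the data, batch size, and step count are made up for illustration.

```python
rng = np.random.default_rng(0)
A = rng.normal(size=(1000, 20))      # synthetic design matrix
b = rng.normal(size=1000)            # synthetic targets

def grad_oracle(x):
    # Unbiased minibatch gradient of the loss 0.5 * ||A @ x - b||^2 / n.
    idx = rng.integers(0, A.shape[0], size=32)
    Ab, bb = A[idx], b[idx]
    return Ab.T @ (Ab @ x - bb) / len(idx)

x_hat = sgd_adgd(grad_oracle, x0=np.zeros(20), n_steps=5000)
```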
Jean-François Aujol
Univ. Bordeaux, CNRS, Bordeaux INP, IMB, UMR 5251, F-33400 Talence, France
Jérémie Bigot
Université de Bordeaux
Statistics, Signal and Image Processing
Camille Castera
Univ. Bordeaux, CNRS, Bordeaux INP, IMB, UMR 5251, F-33400 Talence, France