Exponential Convergence of (Stochastic) Gradient Descent for Separable Logistic Regression

📅 2026-02-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the lack of theoretical guarantees for exponential convergence of gradient descent (GD) and stochastic gradient descent (SGD) under large step sizes in separable logistic regression. The authors propose a non-adaptive, incrementally increasing step-size schedule under which GD achieves exponential convergence, under a margin condition, while remaining entirely within the stable region. For SGD, they design a lightweight adaptive step-size rule that avoids line search while still attaining exponential convergence. Notably, this is the first proof that exponential acceleration for both GD and SGD can be achieved solely through structured step-size growth, without entering unstable regimes or requiring prior knowledge of the target accuracy or optimization horizon. The resulting algorithms have anytime guarantees and significantly improve upon existing polynomial convergence rates for SGD.
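To make the step-size-growth idea concrete, here is a minimal sketch of gradient descent on separable logistic regression with a geometrically increasing step size. The growth factor `rho` and the initial step `eta0` are illustrative choices standing in for the paper's schedule, which is not reproduced here.

```python
# Minimal sketch (assumed parameters, not the authors' exact schedule):
# plain gradient descent on separable logistic regression where the step
# size grows by a fixed factor at every iteration.
import numpy as np
from scipy.special import expit


def logistic_loss_grad(w, X, y):
    """Average logistic loss and its gradient for labels y in {-1, +1}."""
    margins = y * (X @ w)
    loss = np.mean(np.logaddexp(0.0, -margins))
    grad = -(X.T @ (y * expit(-margins))) / len(y)
    return loss, grad


def gd_increasing_steps(X, y, n_iters=200, eta0=1.0, rho=1.05):
    """Non-adaptive GD whose step size is multiplied by `rho` each iteration."""
    w = np.zeros(X.shape[1])
    eta = eta0
    for _ in range(n_iters):
        _, grad = logistic_loss_grad(w, X, y)
        w = w - eta * grad
        eta *= rho  # incrementally increasing, non-adaptive schedule (illustrative)
    return w
```

On linearly separable data this loop keeps growing the step size while the margins increase; the exact growth schedule and the margin condition needed for the exponential-rate guarantee are specified in the paper itself.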

📝 Abstract
Gradient descent and stochastic gradient descent are central to modern machine learning, yet their behavior under large step sizes remains theoretically unclear. Recent work suggests that acceleration often arises near the edge of stability, where optimization trajectories become unstable and difficult to analyze. Existing results for separable logistic regression achieve faster convergence by explicitly leveraging such unstable regimes through constant or adaptive large step sizes. In this paper, we show that instability is not inherent to acceleration. We prove that gradient descent with a simple, non-adaptive increasing step-size schedule achieves exponential convergence for separable logistic regression under a margin condition, while remaining entirely within a stable optimization regime. The resulting method is anytime and does not require prior knowledge of the optimization horizon or target accuracy. We also establish exponential convergence of stochastic gradient descent using a lightweight adaptive step-size rule that avoids line search and specialized procedures, improving upon existing polynomial-rate guarantees. Together, our results demonstrate that carefully structured step-size growth alone suffices to obtain exponential acceleration for both gradient descent and stochastic gradient descent.
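The abstract's SGD result rests on a lightweight adaptive step-size rule that avoids line search and specialized procedures. That rule is not stated here; the sketch below uses an illustrative placeholder (a step inversely proportional to the sampled point's misclassification probability, capped at `eta_max`) purely to show the shape of such an adaptive SGD loop, not the authors' method.

```python
# Hedged sketch: SGD on separable logistic regression with a per-iteration
# adaptive step size. The adaptive rule below is a placeholder for
# illustration only, not the rule analyzed in the paper.
import numpy as np
from scipy.special import expit


def sgd_adaptive(X, y, n_iters=2000, eta_max=1e3, seed=0):
    """Single-sample SGD with an adaptive, line-search-free step size."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        i = rng.integers(len(y))
        x_i, y_i = X[i], y[i]
        p_wrong = expit(-y_i * (x_i @ w))       # probability mass on the wrong label
        grad_i = -y_i * x_i * p_wrong           # per-sample logistic gradient
        eta = min(1.0 / (p_wrong + 1e-12), eta_max)  # placeholder adaptive rule
        w = w - eta * grad_i
    return w
```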
Problem

Research questions and friction points this paper is trying to address.

gradient descent
stochastic gradient descent
exponential convergence
step-size schedule
separable logistic regression
Innovation

Methods, ideas, or system contributions that make the work stand out.

exponential convergence
increasing step size
stable optimization regime
separable logistic regression
stochastic gradient descent
Sacchit Kale
Undergraduate Programme, Indian Institute of Science, Bangalore
Piyushi Manupriya
Dept. of Computer Science and Automation, Indian Institute of Science, Bangalore
Pierre Marion
Inria - Ecole Normale Supérieure
Machine Learning · Deep Learning
Francis Bach
Inria - Ecole Normale Supérieure
Machine Learning · Optimization
Anant Raj
Assistant Professor, Indian Institute of Science, Bengaluru
Machine Learning · Optimization