Stochastic Gradient Descent with Strategic Querying

📅 2025-08-23
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses the inefficiency of uniform gradient sampling in stochastic gradient descent (SGD). We propose Strategic Gradient Querying (SGQ), a method that, at each iteration, dynamically selects the gradient sample most likely to reduce the objective function—guided by the Expected Improvement criterion—rather than sampling uniformly. Under standard Polyak–Łojasiewicz and smoothness assumptions, theoretical analysis shows SGQ achieves faster transient convergence and tighter steady-state variance control than vanilla SGD; we further introduce an idealized Oracle query as a performance upper bound. Crucially, SGQ requires only a single gradient evaluation per iteration, ensuring practicality. Extensive experiments across diverse optimization tasks demonstrate that SGQ significantly accelerates convergence and consistently reduces gradient estimation variance compared to baseline SGD and several adaptive variants.

Technology Category

Application Category

📝 Abstract
This paper considers a finite-sum optimization problem under first-order queries and investigates the benefits of strategic querying on stochastic gradient-based methods compared to uniform querying strategy. We first introduce Oracle Gradient Querying (OGQ), an idealized algorithm that selects one user's gradient yielding the largest possible expected improvement (EI) at each step. However, OGQ assumes oracle access to the gradients of all users to make such a selection, which is impractical in real-world scenarios. To address this limitation, we propose Strategic Gradient Querying (SGQ), a practical algorithm that has better transient-state performance than SGD while making only one query per iteration. For smooth objective functions satisfying the Polyak-Lojasiewicz condition, we show that under the assumption of EI heterogeneity, OGQ enhances transient-state performance and reduces steady-state variance, while SGQ improves transient-state performance over SGD. Our numerical experiments validate our theoretical findings.
Problem

Research questions and friction points this paper is trying to address.

Optimizing finite-sum problems with strategic gradient queries
Improving transient performance over uniform querying in SGD
Reducing steady-state variance under Polyak-Lojasiewicz conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Strategic Gradient Querying for finite-sum optimization
Oracle Gradient Querying with expected improvement selection
Single query per iteration with improved performance
🔎 Similar Papers
No similar papers found.