🤖 AI Summary
This paper addresses the inefficiency of uniform gradient sampling in stochastic gradient descent (SGD). We first introduce Oracle Gradient Querying (OGQ), an idealized method that, at each iteration, selects the single user gradient yielding the largest expected improvement (EI); because OGQ requires oracle access to every user's gradient, we then propose Strategic Gradient Querying (SGQ), a practical variant that makes only one gradient query per iteration. Under standard smoothness and Polyak–Łojasiewicz assumptions, together with an EI-heterogeneity condition, our analysis shows that OGQ improves transient-state convergence and reduces steady-state variance, while SGQ achieves faster transient-state convergence than vanilla SGD. Numerical experiments across several optimization tasks validate these theoretical findings.
📝 Abstract
This paper considers a finite-sum optimization problem under first-order queries and investigates the benefits of strategic querying for stochastic gradient-based methods compared to a uniform querying strategy. We first introduce Oracle Gradient Querying (OGQ), an idealized algorithm that, at each step, selects the one user's gradient yielding the largest possible expected improvement (EI). However, OGQ assumes oracle access to the gradients of all users to make such a selection, which is impractical in real-world scenarios. To address this limitation, we propose Strategic Gradient Querying (SGQ), a practical algorithm that achieves better transient-state performance than SGD while making only one query per iteration. For smooth objective functions satisfying the Polyak–Łojasiewicz condition, we show that, under an EI-heterogeneity assumption, OGQ enhances transient-state performance and reduces steady-state variance, while SGQ improves transient-state performance over SGD. Our numerical experiments validate our theoretical findings.
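To make the querying idea concrete, here is a minimal toy sketch of the idealized OGQ selection rule against uniform SGD on a one-dimensional finite-sum quadratic. The objective, coefficients, step size, and the proxy used for expected improvement (the realized one-step decrease `f(x) - f(x - eta * g_i)`) are all illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

# Toy finite sum: f(x) = (1/n) * sum_i 0.5 * a_i * (x - b_i)^2
# (hypothetical coefficients for illustration only)
rng = np.random.default_rng(0)
n = 5
a = rng.uniform(0.5, 3.0, n)
b = rng.normal(0.0, 1.0, n)

def f(x):
    return np.mean(0.5 * a * (x - b) ** 2)

def grad_i(x, i):
    # Gradient of the i-th user's component f_i(x) = 0.5 * a_i * (x - b_i)^2
    return a[i] * (x - b[i])

eta = 0.1
x0 = 5.0
x_ogq = x_sgd = x0
for t in range(200):
    # OGQ (idealized): inspect all n candidate steps and take the one
    # with the largest realized improvement, a stand-in for the EI rule.
    candidates = [x_ogq - eta * grad_i(x_ogq, i) for i in range(n)]
    x_ogq = min(candidates, key=f)
    # Vanilla SGD: query one user uniformly at random.
    j = rng.integers(n)
    x_sgd = x_sgd - eta * grad_i(x_sgd, j)

x_star = np.sum(a * b) / np.sum(a)  # exact minimizer of this quadratic
print(f(x_ogq), f(x_sgd), f(x_star))
```

On this toy problem the greedy query typically drives the objective down faster in the transient phase, which is the qualitative behavior the paper's analysis formalizes; the practical SGQ algorithm avoids OGQ's n-gradient inspection by making only a single query per iteration.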