🤖 AI Summary
This work studies generalized linear prediction in a single-pass streaming setting with non-quadratic losses and a possibly misspecified model. It introduces momentum into stochastic gradient descent for this setting for the first time, proposing a data-dependent proximal optimization method that incorporates dual momentum to achieve acceleration. The approach resolves an open problem posed by Jain et al. [2018a], establishing a refined excess risk bound that decomposes into an optimization error, a minimax-optimal statistical error, and a higher-order model-misspecification error. The analysis rests on a fine-grained stationarity analysis of the inner updates and a two-phase outer-loop characterization that localizes the statistical error. Both theory and experiments show that momentum-based acceleration outperforms existing variance-reduction methods.
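Schematically, the three-term decomposition described above can be written as follows; the $\varepsilon$ symbols are placeholder notation (not the paper's), and the exact rates are not reproduced here:

```latex
\mathbb{E}\,L(\hat{w}) - \min_{w} L(w)
\;\lesssim\;
\underbrace{\varepsilon_{\mathrm{opt}}}_{\substack{\text{optimization error,}\\ \text{improved by momentum}}}
\;+\;
\underbrace{\varepsilon_{\mathrm{stat}}}_{\substack{\text{minimax}\\ \text{statistical error}}}
\;+\;
\underbrace{\varepsilon_{\mathrm{mis}}}_{\substack{\text{higher-order}\\ \text{misspecification error}}}
```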
📝 Abstract
We study generalized linear prediction in a streaming setting, where each iteration uses only a single fresh data point for a gradient-level update. While momentum is well established in deterministic optimization, whether it can accelerate such single-pass, non-quadratic stochastic optimization has remained a fundamental open question. We propose the first algorithm that successfully incorporates momentum in this setting, via a novel data-dependent proximal method that achieves dual-momentum acceleration. The resulting excess risk bound decomposes into three components: an improved optimization error, a minimax-optimal statistical error, and a higher-order model-misspecification error. The proof handles misspecification via a fine-grained stationarity analysis of the inner updates and localizes the statistical error through a two-phase outer-loop analysis. As a result, we resolve the open problem posed by Jain et al. [2018a] and demonstrate that momentum acceleration is more effective than variance reduction for generalized linear prediction in the streaming setting.
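To make the streaming protocol concrete, here is a minimal sketch of single-pass SGD with classical heavy-ball momentum on a logistic model: each sample is consumed exactly once, and a momentum buffer is blended into every update. This illustrates the setting only, not the paper's data-dependent proximal method with dual momentum; the function name and hyperparameters (`streaming_momentum_sgd`, `eta`, `gamma`) are hypothetical choices for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def streaming_momentum_sgd(stream, dim, eta=0.05, gamma=0.9):
    """Single-pass heavy-ball SGD for logistic regression.

    Consumes each (x, y) pair from `stream` exactly once, matching the
    one-fresh-sample-per-iteration streaming setting; `eta` (step size)
    and `gamma` (momentum weight) are illustrative hyperparameters.
    """
    w = np.zeros(dim)  # current iterate
    v = np.zeros(dim)  # momentum (velocity) buffer
    for x, y in stream:
        grad = (sigmoid(x @ w) - y) * x  # stochastic gradient of the logistic loss
        v = gamma * v + grad             # blend the fresh gradient into the buffer
        w = w - eta * v                  # take the momentum step
    return w

# Toy usage: a synthetic logistic-regression stream with a known parameter.
rng = np.random.default_rng(0)
d = 5
w_star = rng.normal(size=d)
stream = ((x, float(rng.random() < sigmoid(x @ w_star)))
          for x in (rng.normal(size=d) for _ in range(20_000)))
w_hat = streaming_momentum_sgd(stream, d)
```

Because the stream is a generator, no sample is ever revisited, so the total number of stochastic gradient evaluations equals the stream length, as required by the single-pass setting.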