Factor Augmented High-Dimensional SGD

📅 2026-05-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the inefficiency of standard stochastic gradient descent (SGD) in high-dimensional streaming settings by proposing Factor-Augmented SGD (FSGD), which integrates latent factor representations into the optimization process. Because FSGD operates purely on streaming data, it avoids the offline representation learning and full-data storage required by conventional two-stage dimension reduction approaches. The authors also establish what they describe as the first theoretical framework that explicitly incorporates latent factor estimation error into the analysis of SGD, and they prove moment convergence in $\ell^s$ norm under decaying step sizes and mini-batch updates. The result is a theoretically grounded, scalable optimization framework for high-dimensional machine learning tasks.

📝 Abstract

Stochastic gradient descent (SGD) is a fundamental optimization algorithm widely used in modern machine learning. In this paper, we propose Factor-Augmented SGD (FSGD), a new optimization method that leverages latent factor representations in high-dimensional learning tasks. Unlike standard two-stage dimension reduction approaches that rely on offline representation learning and full data storage, a key novelty of FSGD is that it operates purely on streaming data, making it scalable to large-scale and high-dimensional problems. Furthermore, we establish the first theoretical framework that explicitly incorporates latent factor estimation error into the analysis of SGD, and provide moment convergence in $\ell^s$ norm under decaying step sizes and mini-batch updates. Our results provide a new foundation for employing SGD reliably and scalably in high-dimensional machine learning systems.
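
The abstract names three ingredients: streaming data, decaying step sizes, and mini-batch updates on top of estimated latent factors. The sketch below is one way these could fit together. It is not the paper's algorithm, which the abstract does not specify. It assumes a single latent factor, an Oja-style update for the loading direction, a squared loss, and a step-size schedule of the form eta0 * t^(-alpha). The names `fsgd_sketch`, `step_size`, `eta0`, and `alpha` are hypothetical.

```python
import math
import random


def step_size(t, eta0=0.5, alpha=0.6):
    """Decaying step size eta_t = eta0 * t^(-alpha) (assumed schedule)."""
    return eta0 * t ** (-alpha)


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def fsgd_sketch(batches, p, eta0=0.5, alpha=0.6, seed=0):
    """Hypothetical one-factor streaming sketch; not the paper's method.

    batches: iterable of mini-batches, each a list of (x, y) pairs where
    x is a length-p list. Each batch is consumed once and then discarded,
    so no full data set is ever stored. eta0 must be tuned to the data scale.
    """
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(dot(w, w))
    w = [wj / norm for wj in w]  # unit-norm estimate of the loading direction
    beta = 0.0  # regression coefficient on the estimated factor

    for t, batch in enumerate(batches, start=1):
        eta = step_size(t, eta0, alpha)
        m = len(batch)

        # Oja-style update of the loading direction, using this batch only.
        grad_w = [0.0] * p
        for x, _ in batch:
            proj = dot(x, w)
            for j in range(p):
                grad_w[j] += proj * x[j] / m
        w = [wj + eta * gj for wj, gj in zip(w, grad_w)]
        norm = math.sqrt(dot(w, w))
        w = [wj / norm for wj in w]

        # Mini-batch SGD step on squared loss, with factors estimated
        # from the current (imperfect) loading direction.
        grad_beta = 0.0
        for x, y in batch:
            f_hat = dot(x, w)
            grad_beta += (beta * f_hat - y) * f_hat / m
        beta -= eta * grad_beta

    return w, beta
```

Passing a generator of mini-batches, for example `fsgd_sketch((next_batch() for _ in range(300)), p=5)` with a user-supplied `next_batch`, keeps memory use at one batch. The factor estimate `f_hat` is computed from a loading direction that is itself still being learned, which is the kind of factor estimation error the abstract says the analysis accounts for.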

Problem

Research questions and friction points this paper is trying to address.

high-dimensional SGD

latent factors

streaming data

factor estimation error

scalable optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Factor-Augmented SGD

streaming data

latent factor estimation