Variance Reduction and Low Sample Complexity in Stochastic Optimization via Proximal Point Method

📅 2024-02-14
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses stochastic convex composite optimization under a weak noise assumption—namely, only finite variance of stochastic gradients is assumed, without requiring stronger tail conditions (e.g., sub-Gaussianity). We propose a novel stochastic proximal point method that integrates variance reduction with proximal updates and incorporates an efficient iterative solver for the resulting subproblems. Theoretically, under bounded gradient variance, our method achieves high-probability convergence to an $\varepsilon$-accurate solution with $O(1/\varepsilon)$ sample complexity—strictly improving upon standard SGD-type algorithms. Our key contributions are threefold: (i) eliminating the need for strong noise assumptions prevalent in prior high-probability analyses; (ii) establishing the first low-sample-complexity, high-probability convergence framework for stochastic composite optimization; and (iii) unifying the treatment of nonsmooth problem structure and stochastic gradient error within a single algorithmic and analytical framework.

📝 Abstract
This paper proposes a stochastic proximal point method to solve a stochastic convex composite optimization problem. High probability results in stochastic optimization typically hinge on restrictive assumptions on the stochastic gradient noise, for example, sub-Gaussian distributions. Assuming only weak conditions such as bounded variance of the stochastic gradient, this paper establishes a low sample complexity to obtain a high probability guarantee on the convergence of the proposed method. Additionally, a notable aspect of this work is the development of a subroutine to solve the proximal subproblem, which also serves as a novel technique for variance reduction.
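To make the idea concrete, here is a minimal sketch of a stochastic proximal point step: each outer iteration approximately solves the strongly convex proximal subproblem $\min_x \mathbb{E}[f(x;\xi)] + \frac{1}{2\lambda}\|x - x_k\|^2$ with a few inner stochastic gradient steps, and the proximal term damps the gradient noise. The solver structure, step sizes, and toy objective below are illustrative assumptions, not the paper's actual subroutine.

```python
import numpy as np

def stochastic_prox_point(grad_sample, x0, lam=1.0, outer_iters=50,
                          inner_iters=20, inner_lr=0.05, rng=None):
    """Sketch of a stochastic proximal point method.

    Each outer step approximately solves the proximal subproblem
        min_x  E[f(x; xi)] + (1/(2*lam)) * ||x - x_k||^2
    via a few inner stochastic gradient steps. The (1/lam)-strongly
    convex proximal term stabilizes the inner iterates, which is the
    variance-reduction effect the paper exploits.
    `grad_sample(x, rng)` returns a stochastic gradient of f at x.
    """
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    for _ in range(outer_iters):
        center = x.copy()          # proximal center x_k
        y = x.copy()
        for _ in range(inner_iters):
            # stochastic gradient of the regularized subproblem
            g = grad_sample(y, rng) + (y - center) / lam
            y = y - inner_lr * g
        x = y                      # accept the approximate prox point
    return x

# Toy problem: minimize E[0.5 * ||x - b - noise||^2]; the minimizer is b.
b = np.array([1.0, -2.0])
grad = lambda x, rng: (x - b) + 0.1 * rng.standard_normal(2)
x_star = stochastic_prox_point(grad, np.zeros(2))
```

On the toy quadratic, the iterates settle near `b` up to noise of order the inner step size times the gradient-noise level; the actual method in the paper pairs the subproblem solver with a probability booster to obtain high-probability guarantees.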
Problem

Research questions and friction points this paper is trying to address.

Achieving high-probability guarantees under bounded variance assumptions
Developing stochastic proximal point method for variance reduction
Ensuring low sample complexity without restrictive noise conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stochastic proximal point method reduces variance
Probability booster enhances per-iteration reliability
Converges with low sample complexity without restrictive noise assumptions
Jiaming Liang
Goergen Institute for Data Science and Department of Computer Science, University of Rochester, Rochester, NY 14620