Finite-Sample Wasserstein Error Bounds and Concentration Inequalities for Nonlinear Stochastic Approximation

📅 2026-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the absence of finite-sample error bounds and concentration inequalities for nonlinear stochastic approximation algorithms under the Wasserstein-$p$ distance. By coupling the discrete-time iterative process with its Ornstein–Uhlenbeck diffusion limit, the paper establishes the first non-asymptotic distributional convergence rates in Wasserstein distance under general noise conditions, such as martingale differences and functions of ergodic Markov chains. The main contributions include proving that the normalized last iterate converges to a Gaussian distribution at a rate of order $\gamma_n^{1/6}$, where $\gamma_n$ is the step size, while the Polyak–Ruppert averaged iterate converges at a rate of order $n^{-1/6}$. Moreover, the analysis yields high-probability concentration inequalities that improve upon those derived via classical moment-based methods. The proposed framework applies broadly to canonical algorithms, including linear stochastic approximation and stochastic gradient descent.
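
For orientation, here is a minimal sketch of the standard setup this kind of analysis targets; the recursion below, the mean field $h$, the root $\theta^*$, and the noise covariance $\Sigma$ are generic placeholders rather than the paper's exact assumptions:

$$\theta_{n+1} = \theta_n + \gamma_{n+1}\bigl(h(\theta_n) + \varepsilon_{n+1}\bigr), \qquad h(\theta^*) = 0.$$

Writing $A = \nabla h(\theta^*)$ and assuming $A$ is Hurwitz, the coupling compares the rescaled error $(\theta_n - \theta^*)/\sqrt{\gamma_n}$ with the Ornstein–Uhlenbeck diffusion

$$dX_t = A X_t \, dt + \Sigma^{1/2} \, dB_t,$$

whose stationary law is the Gaussian $\mathcal{N}(0, \Sigma_\infty)$, with $\Sigma_\infty$ solving the Lyapunov equation $A \Sigma_\infty + \Sigma_\infty A^\top + \Sigma = 0$.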

📝 Abstract
This paper derives non-asymptotic error bounds for nonlinear stochastic approximation algorithms in the Wasserstein-$p$ distance. To obtain explicit finite-sample guarantees for the last iterate, we develop a coupling argument that compares the discrete-time process to a limiting Ornstein-Uhlenbeck process. Our analysis applies to algorithms driven by general noise conditions, including martingale differences and functions of ergodic Markov chains. Complementing this result, we handle the convergence rate of the Polyak-Ruppert average through a direct analysis that applies under the same general setting. Assuming the driving noise satisfies a non-asymptotic central limit theorem, we show that the normalized last iterates converge to a Gaussian distribution in the $p$-Wasserstein distance at a rate of order $\gamma_n^{1/6}$, where $\gamma_n$ is the step size. Similarly, the Polyak-Ruppert average is shown to converge in the Wasserstein distance at a rate of order $n^{-1/6}$. These distributional guarantees imply high-probability concentration inequalities that improve upon those derived from moment bounds and Markov's inequality. We demonstrate the utility of this approach by considering two applications: (1) linear stochastic approximation, where we explicitly quantify the transition from heavy-tailed to Gaussian behavior of the iterates, thereby bridging the gap between recent finite-sample analyses and asymptotic theory; and (2) stochastic gradient descent, where we establish the rate of convergence in the central limit theorem.
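
To make the Polyak-Ruppert claim tangible, the following self-contained Python sketch simulates a one-dimensional linear stochastic approximation (equivalently, SGD on a quadratic) with martingale-difference noise and checks that the normalized averaged iterate is approximately Gaussian; the recursion, step-size schedule, and constants are illustrative choices, not the paper's.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D linear stochastic approximation (SGD on a quadratic):
#   theta_{n+1} = theta_n - gamma_{n+1} * (a * theta_n - b + noise),
# with root theta* = b / a.
a, b = 1.0, 0.5
theta_star = b / a

def run_sa(n_steps, gamma0=1.0, alpha=0.75):
    """One trajectory with step size gamma_n = gamma0 * n^(-alpha)."""
    theta = 0.0
    avg = 0.0  # running Polyak-Ruppert average of the iterates
    for n in range(1, n_steps + 1):
        gamma = gamma0 * n ** (-alpha)
        noise = rng.standard_normal()      # martingale-difference noise
        theta -= gamma * (a * theta - b + noise)
        avg += (theta - avg) / n           # online running mean
    return theta, avg

# A CLT for the averaged iterate suggests sqrt(n) * (avg - theta*) is
# approximately N(0, sigma^2 / a^2) = N(0, 1) for this toy problem.
n = 5_000
avgs = np.array([run_sa(n)[1] for _ in range(400)])
z = np.sqrt(n) * (avgs - theta_star)
print(f"mean ~ {z.mean():.3f} (target 0), std ~ {z.std():.3f} (target 1)")

Sweeping n in this sketch lets one watch the normalized error drift toward its Gaussian limit, which is the regime the paper's $n^{-1/6}$ Wasserstein rate quantifies.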
Problem

Research questions and friction points this paper is trying to address.

nonlinear stochastic approximation
finite-sample error bounds
Wasserstein distance
concentration inequalities
central limit theorem
Innovation

Methods, ideas, or system contributions that make the work stand out.

Wasserstein distance
nonlinear stochastic approximation
finite-sample bounds
coupling argument
Polyak-Ruppert averaging