🤖 AI Summary
This work addresses the challenge of finite-sample error analysis for two-timescale stochastic approximation algorithms. We establish the first non-asymptotic central limit theorem (CLT) for Polyak–Ruppert averaged estimators, quantifying their distributional behavior under the Wasserstein-1 distance. Our analysis integrates two-timescale iteration dynamics, martingale-difference noise modeling, and refined probabilistic inequalities to derive a tight bound on the expected estimation error. Crucially, we prove convergence at the optimal $1/\sqrt{n}$ rate, surpassing the suboptimal rates achieved by prior analyses. This result yields the first sharp, verifiable finite-time error bound for linear two-timescale algorithms, strengthening interpretability and sample-efficiency guarantees in machine learning applications, particularly reinforcement learning and distributed optimization.
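For context, a standard form of the linear two-time-scale recursion studied in this literature is sketched below; the matrix notation is a common convention assumed here, not quoted from the paper:

$$
\begin{aligned}
x_{n+1} &= x_n - \alpha_n \bigl(A_{11} x_n + A_{12} y_n - b_1 - \xi_{n+1}\bigr),\\
y_{n+1} &= y_n - \beta_n \bigl(A_{21} x_n + A_{22} y_n - b_2 - \psi_{n+1}\bigr),
\end{aligned}
$$

where $(\xi_n, \psi_n)$ is a martingale-difference noise sequence and the step sizes satisfy $\alpha_n / \beta_n \to 0$, so that $x_n$ evolves on the slow timescale. The Polyak–Ruppert estimator averages the slow iterates, $\bar{x}_n = \frac{1}{n} \sum_{k=1}^{n} x_k$, and the non-asymptotic CLT controls the Wasserstein-1 distance between the law of $\sqrt{n}\,(\bar{x}_n - x^\star)$ and its Gaussian limit.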
📝 Abstract
We consider linear two-time-scale stochastic approximation algorithms driven by martingale noise. Recent applications in machine learning motivate the need to understand finite-time error rates, but conventional stochastic approximation analyses focus on either asymptotic convergence in distribution or finite-time bounds that are far from optimal. Prior work on asymptotic central limit theorems (CLTs) suggests that two-time-scale algorithms may be able to achieve $1/\sqrt{n}$ error in expectation, with a constant given by the expected norm of the limiting Gaussian vector. However, the best known finite-time rates are much slower. We derive the first non-asymptotic central limit theorem, with respect to the Wasserstein-1 distance, for two-time-scale stochastic approximation with Polyak–Ruppert averaging. As a corollary, we show that the expected error achieved by Polyak–Ruppert averaging decays at rate $1/\sqrt{n}$, which significantly improves on the rates of convergence in prior works.
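To make the setup concrete, here is a minimal simulation sketch of a linear two-time-scale recursion with Polyak–Ruppert averaging of the slow iterate. The matrices `A11`, `A12`, `A21`, `A22`, the step-size exponents, and the Gaussian noise model are illustrative assumptions chosen for stability, not taken from the paper.

```python
# Minimal sketch: linear two-time-scale stochastic approximation with
# Polyak-Ruppert averaging. All constants below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d = 2

# Assumed coupled linear system with fixed point (x*, y*) = (0, 0).
A11, A12 = np.eye(d), 0.5 * np.eye(d)
A21, A22 = 0.3 * np.eye(d), np.eye(d)
b1, b2 = np.zeros(d), np.zeros(d)

x = np.ones(d)        # slow iterate
y = np.ones(d)        # fast iterate
x_bar = np.zeros(d)   # running Polyak-Ruppert average of the slow iterate
n_iters = 100_000

for n in range(1, n_iters + 1):
    alpha = n ** -0.9   # slow step size
    beta = n ** -0.6    # fast step size; alpha/beta -> 0 separates the timescales

    # Martingale-difference noise, modeled here as i.i.d. zero-mean Gaussian.
    xi = rng.normal(0.0, 0.1, size=d)
    psi = rng.normal(0.0, 0.1, size=d)

    # Coupled linear updates, both evaluated at the current pair (x_n, y_n).
    gx = A11 @ x + A12 @ y - b1 + xi
    gy = A21 @ x + A22 @ y - b2 + psi
    x = x - alpha * gx
    y = y - beta * gy

    # x_bar_n = (1/n) * sum_{k<=n} x_k, updated incrementally.
    x_bar += (x - x_bar) / n

# Per the paper's corollary, E||x_bar - x*|| decays at the optimal 1/sqrt(n) rate.
print("last-iterate error:    ", np.linalg.norm(x))
print("averaged-iterate error:", np.linalg.norm(x_bar))
```

The step-size exponents are chosen in $(1/2, 1)$, the standard regime in which Polyak–Ruppert averaging accelerates the slow iterate; the averaged error should visibly beat the last-iterate error in this sketch.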