🤖 AI Summary
This work addresses the lack of finite-time convergence guarantees for zeroth-order multi-timescale stochastic optimization by studying a two-timescale gradient method and a three-timescale Newton method that rely solely on function-value feedback. Using smoothing functionals to estimate gradients and Hessians, the paper establishes the first non-asymptotic convergence analysis for zeroth-order multi-timescale algorithms, explicitly characterizing the coupling between timescales and the propagation of estimation errors. Key contributions include mean-squared error bounds for the Hessian estimator, a finite-time upper bound on the norm of the objective gradient that establishes convergence to a first-order stationary point, and a step-size strategy that balances the dominant error terms to achieve a near-optimal convergence rate. The theoretical findings are validated in the Continuous Mountain Car environment.
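As a rough illustration of the smoothed-functional idea, the sketch below forms gradient and Hessian estimates of a cost `J` from function evaluations alone, using Gaussian perturbations and two-sided differences. The perturbation distribution, differencing scheme, and function names (`sf_gradient_estimate`, `sf_hessian_estimate`) are illustrative assumptions and need not match the paper's exact estimators.

```python
import numpy as np

def sf_gradient_estimate(J, theta, delta, rng):
    """One-sample smoothed-functional gradient estimate of J at theta using only
    function values: E[u (J(theta+delta*u) - J(theta-delta*u))] / (2*delta)
    equals the gradient of the Gaussian-smoothed cost."""
    u = rng.standard_normal(theta.shape)
    return u * (J(theta + delta * u) - J(theta - delta * u)) / (2.0 * delta)

def sf_hessian_estimate(J, theta, delta, rng):
    """One-sample smoothed-functional Hessian estimate via a Stein-type identity:
    E[(u u^T - I)(J(theta+delta*u) + J(theta-delta*u) - 2 J(theta))] / (2*delta^2)
    equals the Hessian of the Gaussian-smoothed cost."""
    d = theta.size
    u = rng.standard_normal(d)
    central = J(theta + delta * u) + J(theta - delta * u) - 2.0 * J(theta)
    return (np.outer(u, u) - np.eye(d)) * central / (2.0 * delta ** 2)
```

In expectation, both estimates equal the gradient and Hessian of the Gaussian-smoothed cost, which differ from those of $J$ by an $O(\delta^2)$ bias for smooth $J$; mean-squared error bounds of the kind derived in the paper quantify this bias-variance trade-off, although the constants depend on the paper's exact estimator.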
📝 Abstract
We present a finite-time analysis of two smoothed functional stochastic approximation algorithms for simulation-based optimization. The first is a two-timescale gradient-based method, while the second is a three-timescale Newton-based algorithm that estimates both the gradient and the Hessian of the objective function $J$. Both algorithms rely on zeroth-order estimates of the gradient/Hessian. Although the asymptotic convergence of these algorithms has been established in prior work, finite-time guarantees for two-timescale stochastic optimization algorithms in zeroth-order settings have not been provided previously. For our Newton algorithm, we derive mean-squared error bounds for the Hessian estimator and establish a finite-time bound on $\min\limits_{0 \le m \le T} \mathbb{E}\| \nabla J(\theta(m)) \|^2$, showing convergence to first-order stationary points. The analysis explicitly characterizes the interaction between the multiple timescales and the propagation of estimation errors. We further identify step-size choices that balance the dominant error terms and achieve near-optimal convergence rates. We also provide corresponding finite-time guarantees for the gradient algorithm under the same framework. The theoretical results are validated through experiments on the Continuous Mountain Car environment.
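To make the multi-timescale structure concrete, here is a minimal, self-contained sketch of a three-timescale Newton-style loop driven purely by evaluations of $J$: the Hessian estimate is tracked on the fastest timescale, the gradient estimate on a middle one, and $\theta$ moves on the slowest. The step-size exponents, the eigenvalue-floor projection, and the estimator forms are assumptions made for illustration, not the paper's exact choices.

```python
import numpy as np

def three_timescale_newton_sketch(J, theta0, T, delta=0.1, seed=0):
    """Illustrative three-timescale Newton-style loop using only evaluations of J.
    Step sizes satisfy a(n) = o(b(n)) and b(n) = o(c(n)), so the theta update
    (slowest) sees quasi-converged gradient and Hessian averages."""
    rng = np.random.default_rng(seed)
    d = theta0.size
    theta = np.asarray(theta0, dtype=float).copy()
    g_bar, H_bar = np.zeros(d), np.eye(d)            # running gradient / Hessian averages
    for n in range(1, T + 1):
        a, b, c = 1.0 / n, n ** -0.75, n ** -0.6     # slowest, middle, fastest step sizes
        u = rng.standard_normal(d)
        jp, jm, j0 = J(theta + delta * u), J(theta - delta * u), J(theta)
        g_hat = u * (jp - jm) / (2.0 * delta)                                             # SF gradient
        H_hat = (np.outer(u, u) - np.eye(d)) * (jp + jm - 2.0 * j0) / (2.0 * delta ** 2)  # SF Hessian
        g_bar += b * (g_hat - g_bar)                 # middle timescale: gradient tracking
        H_bar += c * (H_hat - H_bar)                 # fastest timescale: Hessian tracking
        # Symmetrize and floor the eigenvalues so the Newton direction is well defined.
        w, V = np.linalg.eigh(0.5 * (H_bar + H_bar.T))
        H_pd = (V * np.maximum(w, 0.1)) @ V.T
        theta -= a * np.linalg.solve(H_pd, g_bar)    # slowest timescale: Newton step
    return theta
```

Dropping the Hessian recursion and replacing the Newton step with `theta -= a * g_bar` yields the corresponding two-timescale gradient variant; the finite-time criterion above, $\min\limits_{0 \le m \le T} \mathbb{E}\| \nabla J(\theta(m)) \|^2$, is evaluated over the iterates such a loop generates.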