🤖 AI Summary
This work addresses the lack of finite-time convergence guarantees for zeroth-order multi-timescale stochastic optimization by studying a two-timescale gradient method and a three-timescale Newton method that rely solely on function-value feedback. Using smoothing functionals to estimate gradients and Hessians, the paper establishes the first non-asymptotic convergence analysis for zeroth-order multi-timescale algorithms, explicitly characterizing the coupling between timescales and the propagation of estimation errors. Key contributions include mean-squared error bounds for the Hessian estimator, a finite-time upper bound on the norm of the objective gradient that establishes convergence to a first-order stationary point, and a step-size strategy that balances the dominant error terms to achieve a near-optimal convergence rate. The theoretical findings are validated in the Continuous Mountain Car environment.
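As a rough illustration of the smoothed-functional idea, the sketch below forms gradient and Hessian estimates of a cost `J` from function evaluations alone, using Gaussian perturbations and two-sided differences. The perturbation distribution, differencing scheme, and function names (`sf_gradient_estimate`, `sf_hessian_estimate`) are illustrative assumptions and need not match the paper's exact estimators.

```python
import numpy as np

def sf_gradient_estimate(J, theta, delta, rng):
    """One-sample smoothed-functional gradient estimate of J at theta using only
    function values: E[u (J(theta+delta*u) - J(theta-delta*u))] / (2*delta)
    equals the gradient of the Gaussian-smoothed cost."""
    u = rng.standard_normal(theta.shape)
    return u * (J(theta + delta * u) - J(theta - delta * u)) / (2.0 * delta)

def sf_hessian_estimate(J, theta, delta, rng):
    """One-sample smoothed-functional Hessian estimate via a Stein-type identity:
    E[(u u^T - I)(J(theta+delta*u) + J(theta-delta*u) - 2 J(theta))] / (2*delta^2)
    equals the Hessian of the Gaussian-smoothed cost."""
    d = theta.size
    u = rng.standard_normal(d)
    central = J(theta + delta * u) + J(theta - delta * u) - 2.0 * J(theta)
    return (np.outer(u, u) - np.eye(d)) * central / (2.0 * delta ** 2)
```

In expectation, both estimates equal the gradient and Hessian of the Gaussian-smoothed cost, which differ from those of $J$ by an $O(\delta^2)$ bias for smooth $J$; mean-squared error bounds of the kind derived in the paper quantify this bias-variance trade-off, although the constants depend on the paper's exact estimator.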
📝 Abstract
We present a finite-time analysis of two smoothed functional stochastic approximation algorithms for simulation-based optimization. The first is a two-timescale gradient-based method, while the second is a three-timescale Newton-based algorithm that estimates both the gradient and the Hessian of the objective function $J$. Both algorithms rely on zeroth-order estimates of the gradient/Hessian. Although the asymptotic convergence of these algorithms has been established in prior work, finite-time guarantees for two-timescale stochastic optimization algorithms in zeroth-order settings have not been provided previously. For our Newton algorithm, we derive mean-squared error bounds for the Hessian estimator and establish a finite-time bound on $\min\limits_{0 \le m \le T} \mathbb{E}\| \nabla J(\theta(m)) \|^2$, showing convergence to first-order stationary points. The analysis explicitly characterizes the interaction between the multiple timescales and the propagation of estimation errors. We further identify step-size choices that balance the dominant error terms and achieve near-optimal convergence rates. We also provide corresponding finite-time guarantees for the gradient algorithm under the same framework. The theoretical results are validated through experiments on the Continuous Mountain Car environment.
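To make the multi-timescale structure concrete, here is a minimal, self-contained sketch of a three-timescale Newton-style loop driven purely by evaluations of $J$: the Hessian estimate is tracked on the fastest timescale, the gradient estimate on a middle one, and $\theta$ moves on the slowest. The step-size exponents, the eigenvalue-floor projection, and the estimator forms are assumptions made for illustration, not the paper's exact choices.

```python
import numpy as np

def three_timescale_newton_sketch(J, theta0, T, delta=0.1, seed=0):
    """Illustrative three-timescale Newton-style loop using only evaluations of J.
    Step sizes satisfy a(n) = o(b(n)) and b(n) = o(c(n)), so the theta update
    (slowest) sees quasi-converged gradient and Hessian averages."""
    rng = np.random.default_rng(seed)
    d = theta0.size
    theta = np.asarray(theta0, dtype=float).copy()
    g_bar, H_bar = np.zeros(d), np.eye(d)            # running gradient / Hessian averages
    for n in range(1, T + 1):
        a, b, c = 1.0 / n, n ** -0.75, n ** -0.6     # slowest, middle, fastest step sizes
        u = rng.standard_normal(d)
        jp, jm, j0 = J(theta + delta * u), J(theta - delta * u), J(theta)
        g_hat = u * (jp - jm) / (2.0 * delta)                                             # SF gradient
        H_hat = (np.outer(u, u) - np.eye(d)) * (jp + jm - 2.0 * j0) / (2.0 * delta ** 2)  # SF Hessian
        g_bar += b * (g_hat - g_bar)                 # middle timescale: gradient tracking
        H_bar += c * (H_hat - H_bar)                 # fastest timescale: Hessian tracking
        # Symmetrize and floor the eigenvalues so the Newton direction is well defined.
        w, V = np.linalg.eigh(0.5 * (H_bar + H_bar.T))
        H_pd = (V * np.maximum(w, 0.1)) @ V.T
        theta -= a * np.linalg.solve(H_pd, g_bar)    # slowest timescale: Newton step
    return theta
```

Dropping the Hessian recursion and replacing the Newton step with `theta -= a * g_bar` yields the corresponding two-timescale gradient variant; the finite-time criterion above, $\min\limits_{0 \le m \le T} \mathbb{E}\| \nabla J(\theta(m)) \|^2$, is evaluated over the iterates such a loop generates.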