🤖 AI Summary
This work addresses the long-standing open problem of online estimation of the optimal limiting covariance matrix in nonsmooth, potentially non-monotone (nonconvex) stochastic approximation (SA). We analyze a recursive, online batch-means covariance estimator (introduced in Zhu et al., 2023) that requires no prespecified sample size. Grounded in nonsmooth variational inclusion theory and asymptotic normality analysis, the estimator achieves asymptotic consistency under weak assumptions, namely that the target mapping is locally Lipschitz and its solution set is isolated. We establish a convergence rate of $O(\sqrt{d}\,n^{-1/8+\varepsilon})$, the fastest known for nonsmooth, non-monotone SA, matching the rate of first-order methods in smooth, strongly convex settings. The estimator enables asymptotically valid statistical inference, including confidence interval construction and hypothesis testing, without requiring prior knowledge of problem-specific constants or tuning parameters.
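To make the inference claim concrete, here is a minimal sketch of how a consistent covariance estimate is typically turned into per-coordinate confidence intervals via the CLT $\sqrt{n}(\bar{x}_n - x^*) \Rightarrow N(0, \Sigma)$. The function name and interface are illustrative, not from the paper, and the plug-in construction shown is the standard one rather than the paper's exact procedure.

```python
import numpy as np

def plugin_confidence_intervals(x_bar, sigma_hat, n, z=1.96):
    """Per-coordinate plug-in confidence intervals (illustrative sketch).

    Given the averaged SA iterate x_bar (shape (d,)), a consistent estimate
    sigma_hat (shape (d, d)) of the limiting covariance, and the sample
    count n, returns approximate intervals based on the CLT
    sqrt(n) (x_bar - x*) -> N(0, Sigma). z=1.96 gives ~95% coverage.
    """
    # Marginal standard errors come from the diagonal of Sigma-hat / n.
    half_width = z * np.sqrt(np.diag(sigma_hat) / n)
    # Row j holds the lower and upper endpoints for coordinate j.
    return np.column_stack((x_bar - half_width, x_bar + half_width))
```

Because the estimator above is consistent, these intervals are asymptotically valid without any problem-specific tuning.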
📝 Abstract
We consider applying stochastic approximation (SA) methods to solve nonsmooth variational inclusion problems. Existing studies have shown that the averaged iterates of SA methods exhibit asymptotic normality, with an optimal limiting covariance matrix in the local minimax sense of Hájek and Le Cam. However, no methods have been proposed to estimate this covariance matrix in a nonsmooth and potentially non-monotone (nonconvex) setting. In this paper, we study an online batch-means covariance matrix estimator introduced in Zhu et al. (2023). The estimator groups the SA iterates appropriately and computes the sample covariance among batches as an estimate of the limiting covariance. Its construction does not require prior knowledge of the total sample size, and updates can be performed recursively as new data arrives. We establish that, as long as the batch size sequence is properly specified (depending on the stepsize sequence), the estimator achieves a convergence rate of order $O(\sqrt{d}\,n^{-1/8+\varepsilon})$ for any $\varepsilon>0$, where $d$ and $n$ denote the problem dimensionality and the number of iterations (or samples) used. Although the problem is nonsmooth and potentially non-monotone (nonconvex), our convergence rate matches the best-known rate for covariance estimation methods using only first-order information in smooth and strongly convex settings. The consistency of this covariance estimator enables asymptotically valid statistical inference, including constructing confidence intervals and performing hypothesis testing.