🤖 AI Summary
This work addresses a limitation of existing single-loop stochastic bilevel optimization algorithms: their analyses lack an explicit characterization of the dependence on the lower-level condition number κ and yield suboptimal convergence rates. We study the Single-loop Stochastic Approximate Implicit Differentiation (SSAID) algorithm and, for the first time in nonconvex stochastic bilevel optimization, establish its explicit dependence on κ. Through a refined complexity analysis, we prove that SSAID converges to an ε-stationary point with an oracle complexity of 𝒪(κ⁷ε⁻²). This result matches the optimal ε⁻² convergence rate achieved by multi-loop methods while preserving the computational efficiency of a single-loop structure.
📝 Abstract
Stochastic bilevel optimization has emerged as a fundamental framework for meta-learning and hyperparameter optimization. Despite the practical prevalence of single-loop algorithms, which update the lower- and upper-level variables concurrently, their theoretical understanding, particularly in the stochastic regime, remains significantly underdeveloped compared to that of their multi-loop counterparts. Existing analyses often yield suboptimal convergence rates or obscure the critical dependence on the lower-level condition number $\kappa$, frequently burying it within generic Lipschitz constants. In this paper, we bridge this gap by providing a refined convergence analysis of the Single-loop Stochastic Approximate Implicit Differentiation (SSAID) algorithm. We prove that SSAID achieves an $\epsilon$-stationary point with an oracle complexity of $\mathcal{O}(\kappa^7 \epsilon^{-2})$. Our result is noteworthy in two respects: (i) it matches the optimal $\mathcal{O}(\epsilon^{-2})$ rate of state-of-the-art multi-loop methods (e.g., stocBiO) while maintaining the computational efficiency of a single-loop update; and (ii) it provides the first explicit, fine-grained characterization of the $\kappa$-dependence for stochastic AID-based single-loop methods. This work demonstrates that SSAID is not merely a heuristic approach, but admits a rigorous theoretical foundation with convergence guarantees competitive with mainstream multi-loop frameworks.
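The abstract itself contains no pseudocode, but the single-loop AID structure it describes can be illustrated concretely. Below is a minimal NumPy sketch on a deterministic toy quadratic bilevel problem: the lower-level variable takes one gradient step per iteration (rather than an inner loop run to convergence), the inverse lower-level Hessian in the implicit gradient $\nabla\Phi(x) = \nabla_x f - \nabla^2_{xy} g\, [\nabla^2_{yy} g]^{-1} \nabla_y f$ is approximated by a truncated Neumann series, and the upper-level variable is then updated with the resulting approximate hypergradient. All names, step sizes, and the choice of Neumann-series approximation here are illustrative assumptions, not the paper's SSAID specification.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 5
A = np.eye(p) + 0.1 * rng.normal(size=(p, p))  # well-conditioned for the toy
b = rng.normal(size=p)

# Toy bilevel problem:
#   upper level: f(x, y) = 0.5 * ||y - b||^2
#   lower level: g(x, y) = 0.5 * ||y - A x||^2, so y*(x) = A x
def grad_y_g(x, y):   # ∇_y g
    return y - A @ x

def grad_y_f(x, y):   # ∇_y f
    return y - b

def hess_yy_g(x, y):  # ∇²_yy g = I (strongly convex lower level)
    return np.eye(p)

def jac_xy_g(x, y):   # ∇²_xy g = -Aᵀ
    return -A.T

x = np.zeros(p)
y = np.zeros(p)
alpha, beta = 0.5, 0.1  # lower- / upper-level step sizes
eta, K = 0.5, 3         # Neumann scaling and number of extra terms

for t in range(500):
    # single-loop structure: ONE lower-level step per iteration
    y = y - alpha * grad_y_g(x, y)

    # approximate implicit differentiation:
    # s ≈ [∇²_yy g]⁻¹ ∇_y f via s = η · Σ_{k=0}^{K} (I - η H)^k v
    H = hess_yy_g(x, y)
    v = grad_y_f(x, y)
    s, term = v.copy(), v.copy()
    for _ in range(K):
        term = term - eta * (H @ term)
        s = s + term
    s = eta * s

    # approximate hypergradient (∇_x f = 0 for this toy f)
    hypergrad = -jac_xy_g(x, y) @ s
    x = x - beta * hypergrad
```

Despite never solving the lower-level problem exactly, the concurrent updates drive `x` toward the solution of the reduced problem $\min_x \tfrac12\|Ax - b\|^2$; the point of the paper is that this kind of single-loop scheme admits rate guarantees matching multi-loop methods.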