🤖 AI Summary
Existing decentralized bilevel optimization methods rely on problem-specific hyperparameters (such as smoothness constants, convexity parameters, and network spectral properties) that are typically unknown a priori, necessitating extensive manual tuning. To address this, we propose AdaSDBO, the first fully parameter-free single-loop algorithm for decentralized bilevel optimization. AdaSDBO adaptively adjusts step sizes for all variables based solely on cumulative gradient norms, requiring no problem-dependent hyperparameter specification and supporting nonconvex upper-level and non-strongly-convex lower-level objectives. We establish its $\widetilde{\mathcal{O}}(1/T)$ convergence rate under standard decentralized settings, matching the theoretical performance of optimally tuned methods. Empirical results demonstrate exceptional robustness to step-size configurations and consistent competitiveness across diverse tasks, validating its practical efficacy and generalizability.
📝 Abstract
Decentralized bilevel optimization has garnered significant attention due to its critical role in solving large-scale machine learning problems. However, existing methods often rely on prior knowledge of problem parameters, such as smoothness, convexity, or communication network topologies, to determine appropriate stepsizes. In practice, these problem parameters are typically unavailable, leading to substantial manual effort for hyperparameter tuning. In this paper, we propose AdaSDBO, a fully problem-parameter-free algorithm for decentralized bilevel optimization with a single-loop structure. AdaSDBO leverages adaptive stepsizes based on cumulative gradient norms to update all variables simultaneously, dynamically adjusting its progress and eliminating the need for problem-specific hyperparameter tuning. Through rigorous theoretical analysis, we establish that AdaSDBO achieves a convergence rate of $\widetilde{\mathcal{O}}\left(\frac{1}{T}\right)$, matching the performance of well-tuned state-of-the-art methods up to polylogarithmic factors. Extensive numerical experiments demonstrate that AdaSDBO delivers competitive performance compared to existing decentralized bilevel optimization methods while exhibiting remarkable robustness across diverse stepsize configurations.
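The cumulative-gradient-norm stepsize rule described above is, in its simplest single-level form, an AdaGrad-norm update: the stepsize is a fixed scale divided by the square root of the running sum of squared gradient norms, so no smoothness or convexity constant is ever needed. The sketch below illustrates only this generic rule on a toy quadratic; the function name `adagrad_norm_step` and the toy problem are illustrative assumptions, not the actual AdaSDBO update, which additionally couples upper- and lower-level variables across a communication network:

```python
import numpy as np

def adagrad_norm_step(x, grad_fn, state, eta0=1.0, eps=1e-8):
    """One AdaGrad-norm update: the stepsize shrinks with cumulative gradient norms.

    Generic sketch of the cumulative-gradient-norm rule; not the AdaSDBO update itself.
    """
    g = grad_fn(x)
    state["accum"] += float(np.dot(g, g))        # running sum of squared gradient norms
    eta = eta0 / np.sqrt(state["accum"] + eps)   # adaptive stepsize, no problem constants
    return x - eta * g, state

# Usage: minimize f(x) = ||x||^2 without supplying its smoothness constant.
state = {"accum": 0.0}
x = np.array([5.0, -3.0])
for _ in range(500):
    x, state = adagrad_norm_step(x, lambda z: 2.0 * z, state)
```

The design point mirrored here is that the stepsize adapts to observed gradients only: early large gradients quickly tame the stepsize, and as gradients vanish the accumulator stabilizes, so the iterate settles without any tuned decay schedule.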