🤖 AI Summary
Existing decentralized bilevel optimization methods rely on problem-specific hyperparameters (such as smoothness constants, convexity parameters, and network spectral properties) that are typically unknown a priori, necessitating extensive manual tuning. To address this, we propose AdaSDBO, the first fully parameter-free single-loop algorithm for decentralized bilevel optimization. AdaSDBO adaptively adjusts step sizes for all variables based solely on cumulative gradient norms, requiring no problem-dependent hyperparameter specification and supporting nonconvex upper-level and non-strongly-convex lower-level objectives. We establish its $\widetilde{\mathcal{O}}(1/T)$ convergence rate under standard decentralized settings, matching the theoretical performance of optimally tuned methods. Empirical results demonstrate exceptional robustness to step-size configurations and consistent competitiveness across diverse tasks, validating its practical efficacy and generalizability.
📝 Abstract
Decentralized bilevel optimization has garnered significant attention due to its critical role in solving large-scale machine learning problems. However, existing methods often rely on prior knowledge of problem parameters, such as smoothness, convexity, or communication network topologies, to determine appropriate stepsizes. In practice, these problem parameters are typically unavailable, leading to substantial manual effort for hyperparameter tuning. In this paper, we propose AdaSDBO, a fully problem-parameter-free algorithm for decentralized bilevel optimization with a single-loop structure. AdaSDBO leverages adaptive stepsizes based on cumulative gradient norms to update all variables simultaneously, dynamically adjusting its progress and eliminating the need for problem-specific hyperparameter tuning. Through rigorous theoretical analysis, we establish that AdaSDBO achieves a convergence rate of $\widetilde{\mathcal{O}}\left(\frac{1}{T}\right)$, matching the performance of well-tuned state-of-the-art methods up to polylogarithmic factors. Extensive numerical experiments demonstrate that AdaSDBO delivers competitive performance compared to existing decentralized bilevel optimization methods while exhibiting remarkable robustness across diverse stepsize configurations.
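The cumulative-gradient-norm stepsize rule described above is, in its simplest single-level form, an AdaGrad-norm update: the stepsize is a fixed scale divided by the square root of the running sum of squared gradient norms, so no smoothness or convexity constant is ever needed. The sketch below illustrates only this generic rule on a toy quadratic; the function name `adagrad_norm_step` and the toy problem are illustrative assumptions, not the actual AdaSDBO update, which additionally couples upper- and lower-level variables across a communication network:

```python
import numpy as np

def adagrad_norm_step(x, grad_fn, state, eta0=1.0, eps=1e-8):
    """One AdaGrad-norm update: the stepsize shrinks with cumulative gradient norms.

    Generic sketch of the cumulative-gradient-norm rule; not the AdaSDBO update itself.
    """
    g = grad_fn(x)
    state["accum"] += float(np.dot(g, g))        # running sum of squared gradient norms
    eta = eta0 / np.sqrt(state["accum"] + eps)   # adaptive stepsize, no problem constants
    return x - eta * g, state

# Usage: minimize f(x) = ||x||^2 without supplying its smoothness constant.
state = {"accum": 0.0}
x = np.array([5.0, -3.0])
for _ in range(500):
    x, state = adagrad_norm_step(x, lambda z: 2.0 * z, state)
```

The design point mirrored here is that the stepsize adapts to observed gradients only: early large gradients quickly tame the stepsize, and as gradients vanish the accumulator stabilizes, so the iterate settles without any tuned decay schedule.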