🤖 AI Summary
This work studies the complexity of finding joint stationary points in nonconvex simple bilevel optimization problems without structural assumptions such as convexity or the Polyak–Łojasiewicz (PL) condition. For the setting where both upper- and lower-level objectives are smooth but nonconvex, we introduce, for the first time, a notion of joint stationarity applicable to general nonconvex simple bilevel structures. We show that a simple variant of Dynamic Barrier Gradient Descent (DBGD), a first-order discrete-time algorithm that treats the upper- and lower-level objectives jointly in each update, converges to an $(\varepsilon_f, \varepsilon_g)$-joint stationary point in polynomial time, with iteration complexity $\mathcal{O}\big(\max\{\varepsilon_f^{-(3+p)/(1+p)},\, \varepsilon_g^{-(3+p)/2}\}\big)$ for any $p \geq 0$. To the best of our knowledge, this is the first work to establish a joint stationarity complexity bound and provide a practical first-order algorithm for general nonconvex simple bilevel optimization, without additional structural assumptions on the lower-level problem.
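To make the role of $p$ concrete, the following sketch evaluates the two exponents in the stated bound for a few values of $p$. The helper name `complexity_exponents` is hypothetical and only restates the formula above; it is not taken from the paper.

```python
from fractions import Fraction


def complexity_exponents(p):
    """Return (a, b) such that the stated bound reads O(max(eps_f**-a, eps_g**-b)).

    Hypothetical helper: it only evaluates the exponents (3+p)/(1+p) and (3+p)/2.
    """
    p = Fraction(p)
    if p < 0:
        raise ValueError("p must be nonnegative")
    return (3 + p) / (1 + p), (3 + p) / 2


# Print the exponents for a few illustrative choices of p.
for p in (0, 1, 3):
    a, b = complexity_exponents(p)
    print(f"p={p}: eps_f exponent = {a}, eps_g exponent = {b}")
```

Under this formula, $p = 0$ gives exponents $3$ and $3/2$, $p = 1$ balances both at $2$, and larger $p$ pushes the $\varepsilon_f$ exponent toward $1$ while the $\varepsilon_g$ exponent grows linearly in $p$.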
📝 Abstract
In this paper, we study the problem of solving a simple bilevel optimization problem, where the upper-level objective is minimized over the solution set of the lower-level problem. We focus on the general setting in which both the upper- and lower-level objectives are smooth but potentially nonconvex. Due to the absence of additional structural assumptions for the lower-level objective, such as convexity or the Polyak–Łojasiewicz (PL) condition, guaranteeing global optimality is generally intractable. Instead, we introduce a suitable notion of stationarity for this class of problems and aim to design a first-order algorithm that finds such stationary points in polynomial time. Intuitively, stationarity in this setting means the upper-level objective cannot be substantially improved locally without causing a larger deterioration in the lower-level objective. To this end, we show that a simple and implementable variant of the dynamic barrier gradient descent (DBGD) framework can effectively solve the considered nonconvex simple bilevel problems up to stationarity. Specifically, to reach an $(\varepsilon_f, \varepsilon_g)$-stationary point, where $\varepsilon_f$ and $\varepsilon_g$ denote the target stationarity accuracies for the upper- and lower-level objectives, respectively, the considered method achieves a complexity of $\mathcal{O}\left(\max\left(\varepsilon_f^{-\frac{3+p}{1+p}}, \varepsilon_g^{-\frac{3+p}{2}}\right)\right)$, where $p \geq 0$ is an arbitrary constant balancing the terms. To the best of our knowledge, this is the first complexity result for a discrete-time algorithm that guarantees joint stationarity for both levels in general nonconvex simple bilevel problems.
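As a rough illustration of the kind of update a dynamic barrier method performs, here is a minimal sketch based on the generic DBGD framework rather than on the specific variant analyzed in the paper. It assumes the barrier $\phi(x) = \beta \|\nabla g(x)\|^2$, a constant step size, and a toy pair of objectives; the names `dbgd`, `beta`, `step`, `grad_f`, and `grad_g` are all hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def dbgd(grad_f, grad_g, x0, step=0.01, beta=1.0, iters=2000):
    """Generic dynamic-barrier-style gradient descent (illustrative sketch only).

    Assumption: the barrier is phi(x) = beta * ||grad g(x)||^2, one common
    choice for this framework; the paper's variant may use a different rule.
    """
    x = list(x0)
    for _ in range(iters):
        gf, gg = grad_f(x), grad_g(x)
        g_norm_sq = dot(gg, gg)
        phi = beta * g_norm_sq
        # Smallest multiplier lam >= 0 such that the direction
        # d = grad_f + lam * grad_g satisfies <grad_g, d> >= phi.
        if g_norm_sq > 0.0:
            lam = max((phi - dot(gf, gg)) / g_norm_sq, 0.0)
        else:
            lam = 0.0  # lower-level gradient vanishes: follow grad_f alone
        x = [xi - step * (a + lam * b) for xi, a, b in zip(x, gf, gg)]
    return x


# Toy problem (an assumption for illustration, not from the paper):
#   lower level g(x) = (x0^2 - 1)^2, nonconvex, minimized whenever x0 = +1 or -1
#   upper level f(x) = (x0 - 2)^2 + x1^2
def grad_f(x):
    return [2.0 * (x[0] - 2.0), 2.0 * x[1]]


def grad_g(x):
    return [4.0 * x[0] * (x[0] ** 2 - 1.0), 0.0]


x_final = dbgd(grad_f, grad_g, [0.5, 1.0])
print(x_final)
```

In this toy setup the iterate should end near $(1, 0)$, which minimizes the upper-level objective over the lower-level solution set. The multiplier can become very large close to lower-level stationary points, which is one reason the choice of barrier and step size matters in any careful analysis.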