On the Complexity of Finding Stationary Points in Nonconvex Simple Bilevel Optimization

📅 2025-07-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies the complexity of finding joint stationary points in nonconvex simple bilevel optimization problems without structural assumptions such as convexity or the Polyak–Łojasiewicz (PL) condition. For the setting where both the upper- and lower-level objectives are smooth but nonconvex, we introduce a notion of joint stationarity that applies to general nonconvex simple bilevel problems. We show that a simple, implementable variant of Dynamic Barrier Gradient Descent (DBGD), a first-order discrete-time method that handles the upper- and lower-level objectives jointly, reaches an $(\varepsilon_f, \varepsilon_g)$-joint stationary point in polynomial time, with iteration complexity $\mathcal{O}\big(\max\{\varepsilon_f^{-(3+p)/(1+p)},\, \varepsilon_g^{-(3+p)/2}\}\big)$ for any $p \geq 0$. To the authors' knowledge, this is the first complexity bound for joint stationarity achieved by a discrete-time first-order algorithm for general nonconvex simple bilevel optimization, and it requires no additional structural assumptions on the lower-level problem.
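To make the role of $p$ concrete, the short Python sketch below evaluates the two exponents in the stated bound. The function names are hypothetical, and the bound is treated as exact with the hidden constant set to 1, which is an assumption made purely for illustration. With $p = 0$ the exponents are $3$ and $3/2$; with $p = 1$ both equal $2$, so equal targets $\varepsilon_f = \varepsilon_g = \varepsilon$ give $\mathcal{O}(\varepsilon^{-2})$.

```python
def complexity_exponents(p):
    """Exponents (a, b) in the bound O(max(eps_f**-a, eps_g**-b)).

    Hypothetical helper: it only restates the formula from the summary.
    """
    if p < 0:
        raise ValueError("p must be nonnegative")
    a = (3.0 + p) / (1.0 + p)  # upper-level exponent, decreasing in p
    b = (3.0 + p) / 2.0        # lower-level exponent, increasing in p
    return a, b


def iteration_bound(eps_f, eps_g, p):
    """The bound with its hidden constant set to 1 (illustrative assumption)."""
    a, b = complexity_exponents(p)
    return max(eps_f ** -a, eps_g ** -b)


if __name__ == "__main__":
    for p in (0.0, 1.0, 3.0):
        a, b = complexity_exponents(p)
        print(f"p={p}: eps_f exponent {a}, eps_g exponent {b}")
```

Increasing $p$ relaxes the dependence on $\varepsilon_f$ while tightening the dependence on $\varepsilon_g$, which is the trade-off the bound leaves to the user.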

📝 Abstract

In this paper, we study the problem of solving a simple bilevel optimization problem, where the upper-level objective is minimized over the solution set of the lower-level problem. We focus on the general setting in which both the upper- and lower-level objectives are smooth but potentially nonconvex. Due to the absence of additional structural assumptions for the lower-level objective, such as convexity or the Polyak-Łojasiewicz (PL) condition, guaranteeing global optimality is generally intractable. Instead, we introduce a suitable notion of stationarity for this class of problems and aim to design a first-order algorithm that finds such stationary points in polynomial time. Intuitively, stationarity in this setting means the upper-level objective cannot be substantially improved locally without causing a larger deterioration in the lower-level objective. To this end, we show that a simple and implementable variant of the dynamic barrier gradient descent (DBGD) framework can effectively solve the considered nonconvex simple bilevel problems up to stationarity. Specifically, to reach an $(\epsilon_f, \epsilon_g)$-stationary point, where $\epsilon_f$ and $\epsilon_g$ denote the target stationarity accuracies for the upper- and lower-level objectives, respectively, the considered method achieves a complexity of $\mathcal{O}\left(\max\left(\epsilon_f^{-\frac{3+p}{1+p}}, \epsilon_g^{-\frac{3+p}{2}}\right)\right)$, where $p \geq 0$ is an arbitrary constant balancing the terms. To the best of our knowledge, this is the first complexity result for a discrete-time algorithm that guarantees joint stationarity for both levels in general nonconvex simple bilevel problems.
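The abstract does not spell out the update rule, so the following is only a minimal Python sketch of a DBGD-style iteration as it commonly appears in the earlier dynamic-barrier literature, not the specific variant analyzed in the paper. Assumptions made here: the lower-level minimum value is 0, the barrier target is $\phi(x) = \min(\alpha\, g(x), \beta \|\nabla g(x)\|^2)$, and the step size and constants are arbitrary. The toy objectives and all function names are hypothetical; the upper-level objective is kept convex for simplicity, while the lower-level one is a nonconvex double well.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dbgd(grad_f, g, grad_g, x0, step=0.01, alpha=1.0, beta=1.0, iters=5000):
    """DBGD-style iteration (sketch). Assumes min g = 0, so g(x) is the lower-level gap."""
    x = list(x0)
    for _ in range(iters):
        gf, gg = grad_f(x), grad_g(x)
        gg_sq = dot(gg, gg)
        # Barrier target: the direction should decrease g to first order by at least phi.
        phi = min(alpha * g(x), beta * gg_sq)
        # Smallest multiplier lam >= 0 such that <gf + lam * gg, gg> >= phi.
        lam = max((phi - dot(gf, gg)) / gg_sq, 0.0) if gg_sq > 0.0 else 0.0
        # Gradient step along the combined direction gf + lam * gg.
        x = [xi - step * (a + lam * b) for xi, a, b in zip(x, gf, gg)]
    return x


# Hypothetical toy problem: nonconvex lower level with minimizers at x[0] = +1 or -1.
def f(x):
    return (x[0] - 2.0) ** 2 + x[1] ** 2


def grad_f(x):
    return [2.0 * (x[0] - 2.0), 2.0 * x[1]]


def g(x):
    return (x[0] ** 2 - 1.0) ** 2


def grad_g(x):
    return [4.0 * x[0] * (x[0] ** 2 - 1.0), 0.0]


x_final = dbgd(grad_f, g, grad_g, [0.5, 1.0])
```

In this toy run the iterate first follows the upper-level gradient, then the multiplier switches on once that direction would increase the lower-level objective, and the iterate settles near the point (1, 0), where the lower-level objective vanishes and the upper-level objective cannot be reduced without raising it.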

Problem

Research questions and friction points this paper is trying to address.

Finding stationary points in nonconvex simple bilevel optimization problems

Designing a first-order algorithm that finds such stationary points in polynomial time

Analyzing the complexity of reaching joint upper- and lower-level stationarity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces a stationarity notion for nonconvex simple bilevel optimization

Uses a simple, implementable variant of the dynamic barrier gradient descent (DBGD) framework

Achieves polynomial-time complexity for joint stationarity

🔎 Similar Papers

On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis