qNBO: quasi-Newton Meets Bilevel Optimization

📅 2025-02-03
🤖 AI Summary
This paper addresses a long-standing challenge in bilevel optimization: solving the lower-level problem and computing inverse Hessian-vector products are typically handled as decoupled steps, leading to high computational overhead. The authors propose the first bilevel optimization framework that systematically integrates the Broyden–Fletcher–Goldfarb–Shanno (BFGS) quasi-Newton method as a unified surrogate for both lower-level optimization and implicit gradient approximation. By leveraging BFGS's superlinear convergence, the method admits a non-asymptotic convergence analysis while avoiding explicit second-order derivative computations and repeated inner-loop iterations. Evaluated on hyperparameter optimization, data hyper-cleaning, and few-shot meta-learning, the approach matches or surpasses state-of-the-art methods while significantly reducing computational cost, demonstrating the effectiveness and practicality of joint optimization in bilevel learning.
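For reference, the implicit gradient that the summary refers to is the standard bilevel hypergradient; its bottleneck is the inverse Hessian-vector product in the middle term. Here $f$ denotes the upper-level objective, $g$ the lower-level objective, and $y^*(x) = \arg\min_y g(x, y)$ (notation assumed, as the page does not fix symbols):

```latex
\nabla F(x) \;=\; \nabla_x f\bigl(x, y^*(x)\bigr)
  \;-\; \nabla^2_{xy} g\bigl(x, y^*(x)\bigr)
        \bigl[\nabla^2_{yy} g\bigl(x, y^*(x)\bigr)\bigr]^{-1}
        \nabla_y f\bigl(x, y^*(x)\bigr)
```

Existing methods solve for $y^*(x)$ and for the linear system involving $\nabla^2_{yy} g$ separately; the paper's point is that a quasi-Newton run on the lower-level problem can serve both purposes at once.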

📝 Abstract
Bilevel optimization, addressing challenges in hierarchical learning tasks, has gained significant interest in machine learning. Applying gradient descent to bilevel optimization in practice encounters computational hurdles, notably the computation of the exact lower-level solution and the inverse Hessian of the lower-level objective. Although these two aspects are inherently connected, existing methods typically handle them separately by solving the lower-level problem and a linear system for the inverse Hessian-vector product. In this paper, we introduce a general framework to address these computational challenges in a coordinated manner. Specifically, we leverage quasi-Newton algorithms to accelerate the resolution of the lower-level problem while efficiently approximating the inverse Hessian-vector product. Furthermore, by exploiting the superlinear convergence properties of BFGS, we establish the non-asymptotic convergence analysis of the BFGS adaptation within our framework. Numerical experiments demonstrate the comparable or superior performance of the proposed algorithms in real-world learning tasks, including hyperparameter optimization, data hyper-cleaning, and few-shot meta-learning.
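The coordinated idea in the abstract can be sketched in a few lines: a single BFGS inverse-Hessian approximation H is accumulated while minimizing the lower-level objective, then reused to approximate the inverse Hessian-vector product needed by the implicit gradient. This is a minimal illustrative sketch in NumPy, not the authors' qNBO algorithm; the function name, fixed step size, and absence of a line search are assumptions for brevity.

```python
import numpy as np

def bfgs_solve_and_ihvp(grad_g, y0, v, steps=200, lr=0.2):
    """Minimize the lower-level objective g (given via its gradient grad_g)
    with BFGS, then reuse the accumulated inverse-Hessian approximation H
    to estimate the inverse Hessian-vector product H_g^{-1} v required by
    the implicit gradient. Illustrative sketch only (fixed step size)."""
    n = y0.size
    H = np.eye(n)                          # inverse-Hessian approximation
    y = y0.astype(float)
    g = grad_g(y)
    for _ in range(steps):
        p = -H @ g                         # quasi-Newton search direction
        y_new = y + lr * p
        g_new = grad_g(y_new)
        s, u = y_new - y, g_new - g        # step and gradient change
        su = s @ u
        if su > 1e-12:                     # curvature condition keeps H positive definite
            rho = 1.0 / su
            V = np.eye(n) - rho * np.outer(s, u)
            H = V @ H @ V.T + rho * np.outer(s, s)   # BFGS inverse update
        y, g = y_new, g_new
    return y, H @ v                        # lower-level solution, approx H_g^{-1} v
```

On a quadratic lower-level objective g(y) = ½ yᵀAy − bᵀy the true minimizer is A⁻¹b and the returned product approximates A⁻¹v, which is exactly the quantity a separate linear-system solve would otherwise compute.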
Problem

Research questions and friction points this paper is trying to address.

Bilevel Optimization
Solution Accuracy
Inverse Hessian
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bilevel Optimization
Quasi-Newton Methods
Inverse Hessian Approximation
👥 Authors

Sheng Fang (Beijing Normal University; Critical Phenomena)
Yong-Jin Liu (Professor, College of Mathematics and Computer Science, Fuzhou University; Mathematical Programming, Statistical Optimization, Numerical Computation)
Wei Yao (National Center for Applied Mathematics Shenzhen, SUSTech; Department of Mathematics, SUSTech)
Chengming Yu (National Center for Applied Mathematics Shenzhen, SUSTech; School of Science, BUPT)
Jin Zhang (National Center for Applied Mathematics Shenzhen, SUSTech; CETC Key Laboratory of Smart City Modeling Simulation and Intelligent Technology, The Smart City Research Institute of CETC)