An Adaptively Inexact Method for Bilevel Learning Using Primal-Dual Style Differentiation

📅 2024-12-09
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the approximation errors that arise in bilevel optimization when the lower-level convex problem is solved only numerically, which compromises the accuracy of the upper-level loss and its hypergradient. To balance precision and efficiency, the proposed framework combines: (1) an a-posteriori error-bound criterion for controlling the lower-level solution accuracy and terminating the piggyback iterations; (2) an adaptive step-size strategy to improve upper-level convergence; and (3) a unified integration of primal-dual-style differentiation, the piggyback algorithm, adaptive gradient optimization, and input-convex neural network (ICNN) regularization. Evaluated on learned-regularizer tasks, the method improves both the stability and the convergence speed of bilevel optimization while preserving theoretical interpretability and computational efficiency.

📝 Abstract
We consider a bilevel learning framework for learning linear operators. In this framework, the learnable parameters are optimized via a loss function that also depends on the minimizer of a convex optimization problem (denoted the lower-level problem). We utilize an iterative algorithm called 'piggyback' to compute the gradient of the loss and the minimizer of the lower-level problem. Given that the lower-level problem is solved numerically, the loss function and thus its gradient can only be computed inexactly. To estimate the accuracy of the computed hypergradient, we derive an a-posteriori error bound, which provides guidance for setting the tolerance for the lower-level problem as well as for the piggyback algorithm. To efficiently solve the upper-level optimization, we also propose an adaptive method for choosing a suitable step size. To illustrate the proposed method, we consider a few learned-regularizer problems, such as training an input-convex neural network.
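To make the piggyback idea concrete, here is a minimal toy sketch under simplifying assumptions: the paper learns linear operators and regularizers, whereas below the learnable parameter is just a scalar ridge weight t, the lower-level problem is a ridge least-squares problem, and both iterations are run for a fixed number of steps rather than stopped by the paper's a-posteriori error bounds. The forward-mode piggyback iterate dx tracks the derivative of the lower-level iterate with respect to t.

```python
import numpy as np

# Toy bilevel problem (illustrative, not the paper's setup):
#   lower level: x*(t) = argmin_x 0.5*||A x - b||^2 + 0.5*t*||x||^2
#   upper level: L(t)  = 0.5*||x*(t) - x_tgt||^2
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
x_tgt = rng.standard_normal(5)

def piggyback_hypergradient(t, n_iter=2000):
    """Run gradient descent on the lower level while 'piggybacking' dx/dt."""
    H = A.T @ A + t * np.eye(5)          # lower-level Hessian
    tau = 1.0 / np.linalg.norm(H, 2)     # step size from the Lipschitz constant
    x = np.zeros(5)                      # lower-level iterate
    dx = np.zeros(5)                     # piggyback iterate, approximates dx/dt
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b) + t * x
        # differentiate the update x <- x - tau*grad w.r.t. t (uses the old x)
        dx = dx - tau * (H @ dx + x)
        x = x - tau * grad
    loss = 0.5 * np.sum((x - x_tgt) ** 2)
    hypergrad = dx @ (x - x_tgt)         # chain rule through x*(t)
    return loss, hypergrad
```

Since the lower level here is quadratic, the result can be checked against the closed form dx*/dt = -H^{-1} x*, which is exactly the kind of ground truth the paper's error bounds replace when no closed form is available.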
Problem

Research questions and friction points this paper is trying to address.

Learning linear operators within a bilevel optimization framework
Hypergradients obtained via primal-dual differentiation are inherently inexact, since the lower-level problem is solved numerically
Choosing step sizes efficiently for the upper-level optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses primal-dual-style differentiation
Employs adaptive, inexact hypergradient computation
Proposes an adaptive step-size rule for the upper-level optimization
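The interplay of the last two points can be sketched as follows. This is a hedged illustration, not the paper's algorithm: `f` and `g` stand in for the upper-level loss and hypergradient, lower-level inexactness is mimicked by an eps-sized perturbation, and the accept/shrink rules are simplified. The idea is that the step size `alpha` is backtracked per step, and only when even small steps make no certified progress is the lower-level tolerance `eps` tightened.

```python
# Hypothetical sketch of an adaptively inexact upper-level loop.
def inexact_value_and_grad(f, g, theta, eps):
    """Surrogate for solving the lower level to accuracy eps."""
    return f(theta) + eps, g(theta) + eps

def adaptive_step(f, g, theta, alpha, eps, rho=0.5, alpha_max=1.0):
    """One upper-level step: backtrack on alpha; shrink eps if stuck."""
    val, grad = inexact_value_and_grad(f, g, theta, eps)
    for _ in range(30):
        cand = theta - alpha * grad
        cand_val, _ = inexact_value_and_grad(f, g, cand, eps)
        # sufficient-decrease test, relaxed by the known error level eps
        if cand_val <= val - 0.5 * alpha * grad**2 + 2.0 * eps:
            return cand, min(alpha / rho, alpha_max), eps
        alpha *= rho
    # backtracking failed: the hypergradient is too inaccurate to make
    # progress, so tighten the lower-level tolerance instead
    return theta, alpha, eps * rho

# toy upper-level objective: L(theta) = (theta - 1)^2
f = lambda t: (t - 1.0) ** 2
g = lambda t: 2.0 * (t - 1.0)

theta, alpha, eps = 5.0, 1.0, 0.1
for _ in range(50):
    theta, alpha, eps = adaptive_step(f, g, theta, alpha, eps)
```

Note that with a fixed tolerance the iterates can only approach the minimizer up to an eps-dependent offset, which is precisely why coupling the tolerance to observed progress matters.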
L. Bogensperger
Department of Quantitative Biomedicine, University of Zurich, 8057, Zurich, Switzerland.
Matthias Joachim Ehrhardt
Department of Mathematical Sciences, University of Bath, BA2 7AY, Bath, UK.
Thomas Pock
Professor of Computer Science, TU Graz
Convex optimization · Image processing · Variational methods · Machine learning · Computer vision
Mohammad Sadegh Salehi
Department of Mathematical Sciences, University of Bath, BA2 7AY, Bath, UK.
Hok Shing Wong
Department of Mathematical Sciences, University of Bath, BA2 7AY, Bath, UK.