An adaptively inexact first-order method for bilevel optimization with application to hyperparameter learning

📅 2023-08-19
📈 Citations: 5
Influential: 1
🤖 AI Summary
Automated hyperparameter learning in variational regularization—particularly within bilevel optimization frameworks—is hindered by the inability to compute exact objective values and hypergradients, which prevents reliable step-size selection because the Lipschitz constant of the hypergradient is unknown. Method: We propose an adaptive bilevel optimization method that requires neither exact function evaluations nor exact gradient computations. Contribution/Results: Our approach introduces (1) the first adaptive accuracy control mechanism based on inexact hypergradients, and (2) a coupled backtracking line search that jointly adapts computational accuracy and step size—eliminating dependence on pre-specified Lipschitz constants or manual tuning. We validate the method on total variation denoising, Field of Experts denoising, and multinomial logistic regression, demonstrating efficiency and robustness, with low sensitivity to the initial accuracies and step size. This provides a reliable, practical paradigm for hyperparameter estimation in imaging and data science.
📝 Abstract
Various tasks in data science are modeled using the variational regularization approach, where manually selecting regularization parameters presents a challenge. The difficulty is exacerbated when employing regularizers involving a large number of hyperparameters. To overcome this challenge, bilevel learning can be employed to learn such parameters from data. However, neither exact function values nor exact gradients with respect to the hyperparameters are attainable, necessitating methods that rely only on inexact evaluations of such quantities. State-of-the-art inexact gradient-based methods select the sequence of required accuracies a priori and cannot identify an appropriate step size, since the Lipschitz constant of the hypergradient is unknown. In this work, we propose an algorithm with backtracking line search that relies only on inexact function evaluations and hypergradients, and we show convergence to a stationary point. Furthermore, the proposed algorithm determines the required accuracy dynamically rather than requiring it to be selected manually in advance. Our numerical experiments demonstrate the efficiency and feasibility of our approach for hyperparameter estimation on a range of relevant problems in imaging and data science, such as total variation and field-of-experts denoising and multinomial logistic regression. In particular, the results show that the algorithm is robust to its own hyperparameters, such as the initial accuracies and step size.
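The abstract describes the core mechanism: gradient descent on the upper-level objective using inexact function values and hypergradients, with a backtracking line search whose sufficient-decrease test is relaxed by the evaluation error, and an accuracy level that is tightened whenever the gradient estimate is too noisy to trust. The sketch below illustrates this idea only; it is not the authors' algorithm, and all names (`inexact_value`, `inexact_grad`, `adaptive_inexact_gd`) and the toy quadratic objective are illustrative assumptions.

```python
# Illustrative sketch (NOT the paper's implementation): descent with inexact
# evaluations, an error-relaxed Armijo backtracking line search, and dynamic
# tightening of the accuracy eps when the gradient estimate is unreliable.
import numpy as np


def inexact_value(x, eps):
    # Stand-in for the upper-level loss after solving the lower-level
    # problem to accuracy eps; here a quadratic with an eps-sized bias.
    return 0.5 * float(x @ x) + eps


def inexact_grad(x, eps):
    # Stand-in for an inexact hypergradient with componentwise error eps.
    return x + eps


def adaptive_inexact_gd(x0, eps=0.1, alpha=1.0, rho=0.5, theta=0.5,
                        c=1e-4, tol=1e-6, max_iter=500):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = inexact_grad(x, eps)
        # If the gradient estimate is dominated by its error, the step
        # direction cannot be trusted: tighten the accuracy and retry.
        if np.linalg.norm(g) <= eps / theta:
            eps *= theta
            continue
        f = inexact_value(x, eps)
        # Backtracking line search with an Armijo test relaxed by the
        # worst-case evaluation error (2 * eps on the value difference).
        while alpha > 1e-12:
            x_trial = x - alpha * g
            if inexact_value(x_trial, eps) <= f - c * alpha * (g @ g) + 2 * eps:
                break
            alpha *= rho
        x = x - alpha * g
        alpha = min(alpha / rho, 1.0)  # let the step size grow back
        if np.linalg.norm(g) <= tol:  # stop once the inexact gradient is small
            break
    return x
```

Because the fake evaluations here have a deterministic eps-sized bias, the iterates stall at a point of size comparable to eps, and progress resumes only after the accuracy is tightened; this is the coupling between accuracy control and step-size selection that the paper's line search formalizes.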
Problem

Research questions and friction points this paper is trying to address.

Overcoming manual hyperparameter selection in variational regularization
Addressing inexact gradient evaluation in bilevel optimization methods
Developing adaptive accuracy control for hyperparameter learning efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptively inexact first-order bilevel optimization method
Dynamic accuracy selection via backtracking line search
Robust hyperparameter learning without exact gradients
Mohammad Sadegh Salehi
Department of Mathematical Sciences, University of Bath, Bath, BA2 7AY, UK
Subhadip Mukherjee
Assistant Professor, Department of E&ECE, IIT Kharagpur, India
Machine Learning · Inverse Problems in Imaging · Optimization
Lindon Roberts
School of Mathematics and Statistics, University of Sydney, Camperdown NSW 2006, Australia
Matthias Joachim Ehrhardt
Department of Mathematical Sciences, University of Bath, Bath, BA2 7AY, UK