🤖 AI Summary
Automated hyperparameter learning in variational regularization, particularly within bilevel optimization frameworks, is hindered in two ways: exact objective values and hypergradients cannot be computed, and the Lipschitz constant of the hypergradient is unknown, which prevents reliable step-size selection.
Method: We propose an adaptive bilevel optimization method that requires neither exact function evaluations nor exact hypergradients; it relies only on inexact evaluations of both.
Contribution/Results: Our approach introduces (1) an adaptive accuracy control mechanism based on inexact hypergradients, and (2) a coupled backtracking line search that jointly adapts computational accuracy and step size, removing the dependence on pre-specified Lipschitz constants and manual tuning. Convergence to a stationary point is shown. We validate the method on total variation and Field of Experts denoising and on multinomial logistic regression; the results indicate that it is efficient and only weakly sensitive to the initial accuracy and step-size choices. This makes it a practical option for hyperparameter estimation in imaging and data science.
📝 Abstract
Various tasks in data science are modeled using the variational regularization approach, where manually selecting regularization parameters presents a challenge. The difficulty is exacerbated when employing regularizers that involve a large number of hyperparameters. To overcome this challenge, bilevel learning can be employed to learn such parameters from data. However, neither exact function values nor exact gradients with respect to the hyperparameters are attainable, necessitating methods that rely only on inexact evaluations of such quantities. State-of-the-art inexact gradient-based methods select the sequence of required accuracies a priori and cannot identify an appropriate step size, since the Lipschitz constant of the hypergradient is unknown. In this work, we propose an algorithm with backtracking line search that relies only on inexact function evaluations and hypergradients, and we show convergence to a stationary point. Furthermore, the proposed algorithm determines the required accuracy dynamically rather than requiring it to be selected manually before the run. Our numerical experiments demonstrate the efficiency and feasibility of our approach for hyperparameter estimation on a range of relevant problems in imaging and data science, such as total variation and field of experts denoising and multinomial logistic regression. In particular, the results show that the algorithm is robust to its own hyperparameters, such as the initial accuracy and step size.
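To make the idea of a backtracking line search with inexact evaluations concrete, the following is a minimal sketch on a toy problem. It is not the algorithm from the paper. The lower-level problem (ridge-type denoising with a single hyperparameter `theta`), the error bound, the tightening factor `0.1`, and all function names are assumptions chosen for illustration. The sketch accepts a step only if the decrease is certified despite the evaluation errors, and it tightens the lower-level accuracy `eps` only when no trial step can be certified.

```python
import math

def solve_lower(theta, y, eps, x_init, max_iter=200):
    """Damped gradient descent on g(x) = 0.5*||x - y||^2 + 0.5*exp(theta)*||x||^2,
    stopped once the lower-level gradient norm is <= eps (toy lower-level solver)."""
    mu = 1.0 + math.exp(theta)               # strong-convexity constant of g
    x = list(x_init)
    for _ in range(max_iter):
        grad = [mu * xi - yi for xi, yi in zip(x, y)]
        if math.sqrt(sum(g * g for g in grad)) <= eps:
            break
        x = [xi - 0.5 * g / mu for xi, g in zip(x, grad)]
    return x

def upper_value(theta, x, x_true, eps):
    """Inexact upper-level loss 0.5*||x - x_true||^2 and a bound on its error."""
    r = math.sqrt(sum((xi - ti) ** 2 for xi, ti in zip(x, x_true)))
    delta = eps / (1.0 + math.exp(theta))    # ||x - x*(theta)|| <= eps / mu
    return 0.5 * r * r, delta * (r + 0.5 * delta)

def hypergradient(theta, x, x_true):
    """Implicit-differentiation hypergradient, evaluated at the inexact x."""
    w = math.exp(theta) / (1.0 + math.exp(theta))
    return -w * sum((xi - ti) * xi for xi, ti in zip(x, x_true))

def inexact_backtracking(theta, y, x_true, eps=1e-1, alpha=1.0, rho=0.5,
                         c=1e-4, eps_min=1e-8, n_outer=200, max_backtracks=10):
    x = solve_lower(theta, y, eps, [0.0] * len(y))
    for _ in range(n_outer):
        f, err = upper_value(theta, x, x_true, eps)
        h = hypergradient(theta, x, x_true)
        a = alpha
        for _ in range(max_backtracks):
            theta_new = theta - a * h
            x_new = solve_lower(theta_new, y, eps, x)   # warm start
            f_new, err_new = upper_value(theta_new, x_new, x_true, eps)
            # Sufficient decrease that holds even if both errors are worst case.
            if f_new + err_new <= f - err - c * a * h * h:
                theta, x, alpha = theta_new, x_new, a / rho  # accept, enlarge step
                break
            a *= rho                                     # shrink the trial step
        else:
            # No trial step gave a certified decrease: accuracy is too coarse.
            if eps <= eps_min:
                break
            eps = max(0.1 * eps, eps_min)
            x = solve_lower(theta, y, eps, x)
    return theta, eps

# Toy data (hypothetical): noisy observation y of a known signal x_true.
y = [1.5, 1.5, 3.5]
x_true = [1.0, 2.0, 3.0]
theta_hat, eps_final = inexact_backtracking(0.0, y, x_true)
```

In this sketch neither the step size nor the accuracy schedule is fixed in advance: `alpha` is enlarged after every accepted step and shrunk during backtracking, while `eps` stays coarse for as long as progress can still be certified. For this toy problem the optimal shrinkage `1 / (1 + exp(theta))` has the closed form `<y, x_true> / ||y||^2`, which gives a simple way to check the returned `theta_hat`.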