Bilevel optimization for learning hyperparameters: Application to solving PDEs and inverse problems with Gaussian processes

📅 2025-10-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Kernel methods and neural networks for scientific computing and inference tasks—such as PDE solving, inverse problems, and supervised learning—exhibit strong sensitivity to hyperparameter selection. Although bilevel optimization offers theoretical rigor, its nested structure incurs prohibitive computational cost under PDE constraints. Method: We propose an efficient bilevel hyperparameter learning framework that replaces iterative inner-loop optimization with a closed-form Gauss–Newton linearized update, thereby avoiding repeated expensive PDE solves. The framework integrates Gaussian process modeling, gradient-based hyperparameter adaptation, and deep kernel functions. Contribution/Results: Experiments demonstrate substantial improvements in accuracy, robustness, and convergence stability on nonlinear PDEs and high-dimensional inverse problems. The method exhibits favorable scalability with respect to problem dimensionality and PDE complexity, enabling practical deployment in large-scale scientific machine learning applications.
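
The nested structure described above can be sketched as follows. The notation here is generic and assumed for illustration, not taken from the paper: $\theta$ denotes the hyperparameters, $u$ the solution variables, $r$ the PDE/data residual, and $J_k$ the Jacobian of $r$ at the current iterate $u_k$.

```latex
% Bilevel formulation (notation assumed, not the paper's exact statement)
\begin{aligned}
&\min_{\theta}\ \mathcal{L}_{\mathrm{outer}}\bigl(u^{*}(\theta),\theta\bigr)
 \quad\text{s.t.}\quad
 u^{*}(\theta)\in\arg\min_{u}\ \tfrac12\,\bigl\|r(u,\theta)\bigr\|^{2},\\[2pt]
&\text{Gauss--Newton inner update:}\quad
 u_{k+1}=u_k-\bigl(J_k^{\top}J_k\bigr)^{-1}J_k^{\top}\,r(u_k,\theta).
\end{aligned}
```

Because the Gauss–Newton update is linear in $u$, a single such step can stand in for the iterative inner loop, which is what reduces each outer iteration to one linearized solve followed by a gradient step on $\theta$.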

📝 Abstract
Methods for solving scientific computing and inference problems, such as kernel- and neural network-based approaches for partial differential equations (PDEs), inverse problems, and supervised learning tasks, depend crucially on the choice of hyperparameters: their efficacy, and in particular their accuracy, stability, and generalization properties, hinges on that choice. While bilevel optimization offers a principled framework for hyperparameter tuning, its nested optimization structure can be computationally demanding, especially in PDE-constrained contexts. In this paper, we propose an efficient strategy for hyperparameter optimization within the bilevel framework by employing a Gauss-Newton linearization of the inner optimization step. Our approach provides closed-form updates, eliminating the need for repeated costly PDE solves. As a result, each iteration of the outer loop reduces to a single linearized PDE solve, followed by explicit gradient-based hyperparameter updates. We demonstrate the effectiveness of the proposed method through Gaussian process models applied to nonlinear PDEs and to PDE inverse problems. Extensive numerical experiments highlight substantial improvements in accuracy and robustness compared to conventional random hyperparameter initialization. In particular, experiments with additive kernels and neural network-parameterized deep kernels demonstrate the method's scalability and effectiveness for high-dimensional hyperparameter optimization.

Problem

Research questions and friction points this paper is trying to address.

Optimizing hyperparameters for scientific computing and inference problems
Reducing computational cost of bilevel optimization in PDE contexts
Improving accuracy and robustness in PDE and inverse problem solutions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses bilevel optimization for hyperparameter learning
Applies Gauss-Newton linearization to inner optimization step
Provides closed-form updates eliminating repeated PDE solves
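
The bullets above can be illustrated with a minimal sketch. This is a deliberate simplification of the paper's method: the inner problem here is plain Gaussian process regression, which is linear, so its closed-form solve plays the role that the Gauss–Newton linearization plays for the paper's nonlinear PDE-constrained inner problem. The outer gradient uses finite differences with backtracking instead of the paper's explicit gradient formulas, and all names (`rbf_kernel`, `tune_lengthscale`, etc.) are hypothetical.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale):
    """Squared-exponential kernel matrix between two point sets."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def inner_solve(lengthscale, X_tr, y_tr, reg=1e-6):
    """Closed-form inner step: GP coefficients alpha = (K + reg*I)^{-1} y.
    The inner problem is linear here, so no iterative inner loop is needed."""
    K = rbf_kernel(X_tr, X_tr, lengthscale)
    return np.linalg.solve(K + reg * np.eye(len(X_tr)), y_tr)

def outer_loss(lengthscale, X_tr, y_tr, X_val, y_val):
    """Outer objective: validation MSE of the GP predictor."""
    alpha = inner_solve(lengthscale, X_tr, y_tr)
    pred = rbf_kernel(X_val, X_tr, lengthscale) @ alpha
    return float(np.mean((pred - y_val) ** 2))

def tune_lengthscale(X_tr, y_tr, X_val, y_val, theta=1.0, steps=50, h=1e-5):
    """Outer loop: finite-difference gradient descent with backtracking.
    Each iteration costs only a few closed-form inner solves."""
    loss = outer_loss(theta, X_tr, y_tr, X_val, y_val)
    for _ in range(steps):
        g = (outer_loss(theta + h, X_tr, y_tr, X_val, y_val)
             - outer_loss(theta - h, X_tr, y_tr, X_val, y_val)) / (2 * h)
        step = 0.5
        while step > 1e-10:  # backtrack until the outer loss decreases
            cand = theta - step * g
            if cand > 0:
                cand_loss = outer_loss(cand, X_tr, y_tr, X_val, y_val)
                if cand_loss < loss:
                    theta, loss = cand, cand_loss
                    break
            step *= 0.5
    return theta, loss
```

The backtracking line search keeps the outer loss monotone, mirroring the stability that a principled bilevel scheme aims for; in the paper's setting, autodiff or explicit gradients would replace the finite differences.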