AI Summary
This paper addresses inverse problems for partial differential equations (PDEs), specifically the joint inference of unknown parameters and unknown functions from sparse, noisy observational data. We propose a bilevel optimization framework: the upper level optimizes the PDE parameters against the data under hard physics-informed constraints, while the lower level trains a neural network to approximate the local solution operator and its parameter gradients, which yields an accurate estimate of the descent direction. Our key contributions are: (1) the first bilevel local operator learning mechanism, which couples solution approximation with gradient estimation; (2) elimination of the hand-tuned trade-off between data loss and residual loss by enforcing the PDE as a hard constraint; and (3) robustness to sparse and noisy data without tuning a loss-balancing weight. Experiments across diverse PDE inverse problems demonstrate substantial improvements in both parameter and function estimation accuracy, faster convergence, and enhanced generalization.
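One way to write the bilevel structure described here is sketched below. The notation is assumed for illustration and is not taken from the source: $\Theta$ denotes the PDE parameters, $W$ the network weights, $u(\cdot,\Theta;W)$ the network's approximation of the solution, $\mathcal{F}$ the PDE residual operator, and $w_{\mathrm{rgrad}}$ a weight on the residual-gradient term. Squaring the norms and weighting the second term are also assumptions.

```latex
\begin{aligned}
\Theta^{*} &= \arg\min_{\Theta}\; \mathcal{L}_{\mathrm{data}}\bigl(\Theta, W^{*}(\Theta)\bigr), \\
W^{*}(\Theta) &= \arg\min_{W}\; \mathcal{L}_{\mathrm{LO}}(\Theta, W), \\
\mathcal{L}_{\mathrm{LO}}(\Theta, W) &=
  \bigl\| \mathcal{F}\bigl(u(\cdot,\Theta;W),\Theta\bigr) \bigr\|_{2}^{2}
  + w_{\mathrm{rgrad}}\, \bigl\| \nabla_{\Theta}\, \mathcal{F}\bigl(u(\cdot,\Theta;W),\Theta\bigr) \bigr\|_{2}^{2}.
\end{aligned}
```

Driving both terms of the lower-level loss to zero at the current $\Theta$ makes the network solve the PDE there and remain a solution to first order under small changes in $\Theta$, which is what makes $\partial u / \partial \Theta$ usable as a descent direction at the upper level.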
Abstract
We propose a new neural-network-based method for solving inverse problems for partial differential equations (PDEs) by formulating the PDE inverse problem as a bilevel optimization problem. At the upper level, we minimize the data loss with respect to the PDE parameters. At the lower level, we train a neural network to locally approximate the PDE solution operator in the neighborhood of a given set of PDE parameters, which enables an accurate approximation of the descent direction for the upper-level optimization problem. The lower-level loss function includes the L2 norms of both the residual and its derivative with respect to the PDE parameters. We apply gradient descent simultaneously to the upper- and lower-level optimization problems, leading to an effective and fast algorithm. The method, which we refer to as BiLO (Bilevel Local Operator learning), can also efficiently infer unknown functions in the PDEs through the introduction of an auxiliary variable. Through extensive experiments on multiple PDE systems, we demonstrate that our method enforces strong PDE constraints, is robust to sparse and noisy data, and eliminates the need to balance the residual and the data loss, which is inherent to the soft PDE constraints in many existing methods.
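The simultaneous upper- and lower-level descent described in the abstract can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' implementation. The test equation, the two-weight surrogate standing in for the neural network, the hand-derived gradients, the warm-up phase, and all names (`bilo_toy`, `lower_grad`, `upper_grad`, `w_rgrad`) and step sizes are hypothetical choices made for illustration.

```python
import math
import random

# Toy problem (an assumption of this sketch): -theta * u''(x) = sin(x) on [0, pi],
# u(0) = u(pi) = 0, whose exact solution is u(x) = sin(x) / theta.
THETA_TRUE = 1.0


def surrogate(x, theta, w):
    """Stand-in for the network u(x, theta; W): amplitude linear in theta."""
    return (w[0] + w[1] * theta) * math.sin(x)


def lower_grad(theta, w, xs, w_rgrad=1.0):
    """Gradient w.r.t. W of mean(residual^2 + w_rgrad * (d residual/d theta)^2)."""
    g0 = g1 = 0.0
    for x in xs:
        s = math.sin(x)
        res = s * (theta * (w[0] + w[1] * theta) - 1.0)  # PDE residual
        dres = s * (w[0] + 2.0 * w[1] * theta)           # d(residual)/d(theta)
        g0 += 2.0 * (res * s * theta + w_rgrad * dres * s)
        g1 += 2.0 * (res * s * theta ** 2 + w_rgrad * dres * 2.0 * s * theta)
    return g0 / len(xs), g1 / len(xs)


def upper_grad(theta, w, xs_obs, ys_obs):
    """d(data loss)/d(theta) with W frozen, via the surrogate's theta-dependence."""
    g = 0.0
    for x, y in zip(xs_obs, ys_obs):
        du_dtheta = w[1] * math.sin(x)
        g += 2.0 * (surrogate(x, theta, w) - y) * du_dtheta
    return g / len(xs_obs)


def bilo_toy(theta0=1.5, n_pretrain=1000, n_steps=10000,
             lr_w=0.05, lr_theta=0.002, seed=0):
    rng = random.Random(seed)
    xs_col = [math.pi * (j + 0.5) / 20 for j in range(20)]  # collocation points
    xs_obs = [math.pi * (i + 0.5) / 8 for i in range(8)]    # sparse observations
    ys_obs = [math.sin(x) / THETA_TRUE + rng.gauss(0.0, 0.01) for x in xs_obs]

    theta, w = theta0, [0.0, 0.0]
    # Warm-up: solve the lower-level problem at the initial guess theta0.
    for _ in range(n_pretrain):
        g0, g1 = lower_grad(theta, w, xs_col)
        w = [w[0] - lr_w * g0, w[1] - lr_w * g1]
    # Simultaneous gradient descent on both levels.
    for _ in range(n_steps):
        g0, g1 = lower_grad(theta, w, xs_col)
        g_theta = upper_grad(theta, w, xs_obs, ys_obs)
        w = [w[0] - lr_w * g0, w[1] - lr_w * g1]
        theta -= lr_theta * g_theta
    return theta, w


if __name__ == "__main__":
    theta_hat, w_hat = bilo_toy()
    print("estimated theta:", theta_hat)
```

In this sketch no weight balances the data loss against the residual loss: the data loss only drives `theta`, and the residual terms only drive `w`. A realistic setup would replace the two-weight surrogate with a neural network and the hand-written gradients with automatic differentiation.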