Multi-Level Monte Carlo Training of Neural Operators

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Training neural operators to approximate high-resolution partial differential equation (PDE) solution operators incurs prohibitive computational cost, making it difficult to achieve high accuracy and efficiency simultaneously. Method: This work introduces the multilevel Monte Carlo (MLMC) method into neural operator training for the first time. By constructing a hierarchy of function discretizations across multiple resolutions and applying gradient corrections computed from a small number of fine-resolution samples, the approach significantly reduces the number of training samples, and hence the computational cost, required for high-accuracy training. The method is architecture-agnostic—compatible with the Fourier Neural Operator, DeepONet, Graph Neural Operator, and other mainstream frameworks—and requires no structural modifications. Results: Evaluation on state-of-the-art models and standard PDE benchmarks demonstrates up to a 3.2× speedup in training time at equivalent accuracy, and characterizes a Pareto trade-off curve between accuracy and training time governed by the number of samples per resolution, while maintaining scalability and generalizability.

📝 Abstract
Operator learning is a rapidly growing field that aims to approximate nonlinear operators related to partial differential equations (PDEs) using neural operators. These rely on discretization of input and output functions and are, usually, expensive to train for large-scale problems at high-resolution. Motivated by this, we present a Multi-Level Monte Carlo (MLMC) approach to train neural operators by leveraging a hierarchy of resolutions of function discretization. Our framework relies on using gradient corrections from fewer samples of fine-resolution data to decrease the computational cost of training while maintaining a high level of accuracy. The proposed MLMC training procedure can be applied to any architecture accepting multi-resolution data. Our numerical experiments on a range of state-of-the-art models and test-cases demonstrate improved computational efficiency compared to traditional single-resolution training approaches, and highlight the existence of a Pareto curve between accuracy and computational time, related to the number of samples per resolution.
Problem

Research questions and friction points this paper is trying to address.

Reduce training cost for high-resolution neural operators
Improve efficiency in approximating PDE-related nonlinear operators
Balance accuracy and computational time in multi-resolution training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Level Monte Carlo for neural operator training
Leverages gradient corrections from fine-resolution samples
Reduces cost while maintaining high accuracy
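The gradient-correction idea can be illustrated as a telescoping MLMC estimator: many cheap samples estimate the gradient at the coarsest resolution, and progressively fewer samples estimate the corrections between adjacent resolutions. Below is a minimal toy sketch of this structure — the quadratic loss, the resolution-dependent target, and all function names are hypothetical stand-ins, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def level_gradient(theta, noise, level):
    # Hypothetical stand-in for a per-level stochastic gradient:
    # the loss target is a Riemann sum of sin(x) on 2**(level + 2)
    # grid points, so the gradient converges as resolution grows.
    n = 2 ** (level + 2)
    grid = np.linspace(0.0, np.pi, n)
    target = np.mean(np.sin(grid)) * np.pi  # -> 2 as n -> infinity
    # Stochastic gradient of the quadratic loss 0.5 * (theta - target)^2
    return float(np.mean(theta - target + noise))

def mlmc_gradient(theta, n_samples):
    # Telescoping MLMC estimator: a coarse gradient from many cheap
    # samples plus corrections (g_l - g_{l-1}) from fewer fine samples.
    g = 0.0
    for level, n in enumerate(n_samples):
        noise = 0.1 * rng.standard_normal(n)
        g += level_gradient(theta, noise, level)
        if level > 0:
            # The same samples are used at both resolutions, so the
            # correction term has low variance and needs few samples.
            g -= level_gradient(theta, noise, level - 1)
    return g

# Many cheap coarse samples, progressively fewer fine ones.
theta = 0.0
for _ in range(200):
    theta -= 0.1 * mlmc_gradient(theta, n_samples=[400, 100, 25])
```

The coupling of samples across adjacent levels is what makes the corrections cheap: because `g_l - g_{l-1}` is small and low-variance, only a handful of expensive fine-resolution evaluations are needed on top of the abundant coarse ones.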
James Rowbottom
University of Cambridge
Geometric Deep Learning · Dynamical Systems · Machine Learning
Stefania Fresca
MOX - Dipartimento di Matematica, Politecnico di Milano
Scientific Machine Learning · Reduced Order Modeling (Dimensionality Reduction) · Numerical Analysis
Pietro Lio
Department of Computer Science and Technology, University of Cambridge
Carola-Bibiane Schönlieb
Department of Applied Mathematics and Theoretical Physics, University of Cambridge
Nicolas Boullé
Department of Mathematics, Imperial College London