Multi-Level Monte Carlo Training of Neural Operators

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Training neural operators to approximate high-resolution partial differential equation (PDE) solution operators incurs prohibitive computational cost, making it difficult to achieve high accuracy and efficiency simultaneously. Method: This work introduces the multilevel Monte Carlo (MLMC) method into neural operator training for the first time. By constructing a hierarchy of function discretizations across multiple resolutions and applying gradient corrections computed from a small number of fine-resolution samples, the approach significantly reduces the number of training samples, and hence the computational cost, required for high-accuracy training. The method is architecture-agnostic—compatible with the Fourier Neural Operator, DeepONet, Graph Neural Operator, and other mainstream frameworks—and requires no structural modifications. Results: Evaluation on state-of-the-art models and standard PDE benchmarks demonstrates up to a 3.2× speedup in training time at equivalent accuracy, and characterizes a Pareto trade-off curve between accuracy and training time governed by the number of samples per resolution, while maintaining scalability and generalizability.

📝 Abstract
Operator learning is a rapidly growing field that aims to approximate nonlinear operators related to partial differential equations (PDEs) using neural operators. These rely on discretization of input and output functions and are, usually, expensive to train for large-scale problems at high-resolution. Motivated by this, we present a Multi-Level Monte Carlo (MLMC) approach to train neural operators by leveraging a hierarchy of resolutions of function discretization. Our framework relies on using gradient corrections from fewer samples of fine-resolution data to decrease the computational cost of training while maintaining a high level of accuracy. The proposed MLMC training procedure can be applied to any architecture accepting multi-resolution data. Our numerical experiments on a range of state-of-the-art models and test-cases demonstrate improved computational efficiency compared to traditional single-resolution training approaches, and highlight the existence of a Pareto curve between accuracy and computational time, related to the number of samples per resolution.
Problem

Research questions and friction points this paper is trying to address.

Reduce training cost for high-resolution neural operators
Improve efficiency in approximating PDE-related nonlinear operators
Balance accuracy and computational time in multi-resolution training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Level Monte Carlo for neural operator training
Leverages gradient corrections from fine-resolution samples
Reduces cost while maintaining high accuracy
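The gradient-correction idea can be illustrated as a telescoping MLMC estimator: many cheap samples estimate the gradient at the coarsest resolution, and progressively fewer samples estimate the corrections between adjacent resolutions. Below is a minimal toy sketch of this structure — the quadratic loss, the resolution-dependent target, and all function names are hypothetical stand-ins, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def level_gradient(theta, noise, level):
    # Hypothetical stand-in for a per-level stochastic gradient:
    # the loss target is a Riemann sum of sin(x) on 2**(level + 2)
    # grid points, so the gradient converges as resolution grows.
    n = 2 ** (level + 2)
    grid = np.linspace(0.0, np.pi, n)
    target = np.mean(np.sin(grid)) * np.pi  # -> 2 as n -> infinity
    # Stochastic gradient of the quadratic loss 0.5 * (theta - target)^2
    return float(np.mean(theta - target + noise))

def mlmc_gradient(theta, n_samples):
    # Telescoping MLMC estimator: a coarse gradient from many cheap
    # samples plus corrections (g_l - g_{l-1}) from fewer fine samples.
    g = 0.0
    for level, n in enumerate(n_samples):
        noise = 0.1 * rng.standard_normal(n)
        g += level_gradient(theta, noise, level)
        if level > 0:
            # The same samples are used at both resolutions, so the
            # correction term has low variance and needs few samples.
            g -= level_gradient(theta, noise, level - 1)
    return g

# Many cheap coarse samples, progressively fewer fine ones.
theta = 0.0
for _ in range(200):
    theta -= 0.1 * mlmc_gradient(theta, n_samples=[400, 100, 25])
```

The coupling of samples across adjacent levels is what makes the corrections cheap: because `g_l - g_{l-1}` is small and low-variance, only a handful of expensive fine-resolution evaluations are needed on top of the abundant coarse ones.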
James Rowbottom
University of Cambridge
Geometric Deep Learning · Dynamical Systems · Machine Learning
Stefania Fresca
MOX - Dipartimento di Matematica, Politecnico di Milano
Scientific Machine Learning · Reduced Order Modeling (Dimensionality Reduction) · Numerical Analysis
Pietro Lio
Department of Computer Science and Technology, University of Cambridge
Carola-Bibiane Schönlieb
Department of Applied Mathematics and Theoretical Physics, University of Cambridge
Nicolas Boullé
Department of Mathematics, Imperial College London