Two-level overlapping additive Schwarz preconditioner for training scientific machine learning applications

📅 2024-06-16
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
To address slow convergence and limited accuracy in scientific machine learning—particularly physics-informed neural networks (PINNs) and operator learning—this paper proposes a novel two-level overlapping Schwarz preconditioner. The method innovatively integrates nonlinear domain decomposition into ML optimization: it partitions network parameters across overlapping subdomains, performs synchronized subdomain updates, and employs a coarse-level parameter coordination mechanism that implicitly incorporates the network’s forward structure. Coupled with model parallelism and the L-BFGS optimizer, the framework enables efficient distributed training. Experiments demonstrate that the approach accelerates convergence by 2–5×, improves prediction accuracy, and reduces both communication and computational overhead in large-scale training. This work establishes a scalable, high-performance optimization paradigm for scientific AI.
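
The parameter decomposition described above can be pictured as splitting a network's trainable parameters into overlapping groups. Below is a minimal, hypothetical PyTorch sketch of such a grouping; the layer-wise split and the one-layer overlap are illustrative assumptions, not the paper's exact decomposition.

```python
# Hypothetical sketch: layer-wise parameter subdomains with overlap.
# The grouping below is an assumption for illustration only; the paper's
# actual decomposition of the network parameters may differ.
import torch.nn as nn

def build_overlapping_subdomains(model: nn.Sequential, overlap: int = 1):
    """Group the Linear layers' parameters into overlapping subdomains."""
    layers = [m for m in model if isinstance(m, nn.Linear)]
    subdomains = []
    for i in range(len(layers)):
        lo = max(0, i - overlap)                 # extend the group to the left ...
        hi = min(len(layers), i + 1 + overlap)   # ... and to the right
        subdomains.append([p for layer in layers[lo:hi]
                           for p in layer.parameters()])
    return subdomains

# Example: a small PINN-style MLP split into overlapping parameter groups.
mlp = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
                    nn.Linear(32, 32), nn.Tanh(),
                    nn.Linear(32, 1))
subdomains = build_overlapping_subdomains(mlp, overlap=1)  # 3 groups here
```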

📝 Abstract
We introduce a novel two-level overlapping additive Schwarz preconditioner for accelerating the training of scientific machine learning applications. The design of the proposed preconditioner is motivated by the nonlinear two-level overlapping additive Schwarz preconditioner. The neural network parameters are decomposed into groups (subdomains) with overlapping regions. In addition, the network's feed-forward structure is indirectly imposed through a novel subdomain-wise synchronization strategy and a coarse-level training step. Through a series of numerical experiments, which consider physics-informed neural networks and operator learning approaches, we demonstrate that the proposed two-level preconditioner significantly speeds up the convergence of the standard (L-BFGS) optimizer while also yielding more accurate machine learning models. Moreover, the devised preconditioner is designed to take advantage of model-parallel computations, which can further reduce the training time.
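
To make the two-level structure concrete, here is a hedged sketch of one preconditioned iteration under several assumptions: independent local L-BFGS solves on overlapping parameter groups like the ones sketched above, an additive (damped) recombination of the resulting corrections, and a short coarse-level L-BFGS solve over a small coordination set. The names `loss_fn`, `coarse_params`, and `damping` are illustrative placeholders; this is not the authors' implementation.

```python
# Hedged sketch of one two-level additive Schwarz preconditioned step.
# Assumptions: `loss_fn(model)` returns a scalar training loss (e.g. a PINN
# residual), `subdomains` is a list of overlapping parameter groups, and
# `coarse_params` is a small parameter set used for the coarse level.
import torch

def make_closure(model, loss_fn):
    def closure():
        model.zero_grad()
        loss = loss_fn(model)
        loss.backward()
        return loss
    return closure

def schwarz_preconditioned_step(model, subdomains, coarse_params, loss_fn,
                                damping=0.5, local_iters=5, coarse_iters=5):
    reference = [p.detach().clone() for p in model.parameters()]

    # Level 1: independent local solves (these could run in parallel).
    corrections = []
    for params in subdomains:
        local_opt = torch.optim.LBFGS(params, max_iter=local_iters)
        local_opt.step(make_closure(model, loss_fn))
        corrections.append([p.detach().clone() for p in model.parameters()])
        with torch.no_grad():  # restore the reference point for the next solve
            for p, r in zip(model.parameters(), reference):
                p.copy_(r)

    # Additive recombination: sum the (damped) subdomain corrections.
    with torch.no_grad():
        for i, p in enumerate(model.parameters()):
            delta = sum(c[i] - reference[i] for c in corrections)
            p.copy_(reference[i] + damping * delta)

    # Level 2: a short coarse-level solve to couple the subdomains globally.
    coarse_opt = torch.optim.LBFGS(coarse_params, max_iter=coarse_iters)
    coarse_opt.step(make_closure(model, loss_fn))
```

In a model-parallel setting of the kind the abstract mentions, the loop over subdomains would be distributed across workers, with the coarse-level solve acting as the synchronization point between them.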
Problem

Research questions and friction points this paper is trying to address.

Accelerating neural network training for scientific applications
Improving convergence speed and model accuracy simultaneously
Enabling efficient model-parallel computations during training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-level overlapping additive Schwarz preconditioner
Subdomain-wise synchronization strategy
Coarse-level training step
Youngkyu Lee
Brown University
Computational mathematics · Parallel computation · Neural network · Machine learning
Alena Kopaničáková
Division of Applied Mathematics, Brown University, Providence, USA
G. Karniadakis
Division of Applied Mathematics, Brown University, Providence, USA