🤖 AI Summary
This work proposes a modular mixture-of-experts architecture for solving partial differential equations, addressing the limited generalization and reliance on fixed temporal discretizations of existing neural operators. By integrating an operator splitting strategy, the approach explicitly encodes physical structure: nonlinear physics components are learned via neural operators, while linear parts are approximated using fixed finite-difference convolutions. The entire framework is embedded within a Neural ODE formulation, enabling continuous-time prediction. This design yields superior out-of-distribution generalization, higher parameter efficiency, and faster convergence on both incompressible and compressible Navier–Stokes equations. Notably, the method supports temporal extrapolation beyond the training time horizon, demonstrating robust predictive capability in unseen regimes.
📝 Abstract
Neural operators have emerged as promising surrogate models for solving partial differential equations (PDEs), but struggle to generalise beyond training distributions and are often constrained to a fixed temporal discretisation. This work introduces a physics-informed training framework that addresses these limitations by decomposing PDEs using operator splitting methods, training separate neural operators to learn individual non-linear physical operators while approximating linear operators with fixed finite-difference convolutions. This modular mixture-of-experts architecture enables generalisation to novel physical regimes by explicitly encoding the underlying operator structure. We formulate the modelling task as a neural ordinary differential equation (ODE) where these learned operators constitute the right-hand side, enabling continuous-in-time predictions through standard ODE solvers and implicitly enforcing PDE constraints. Demonstrated on incompressible and compressible Navier-Stokes equations, our approach achieves better convergence and superior performance when generalising to unseen physics. The method remains parameter-efficient, enabling temporal extrapolation beyond training horizons, and provides interpretable components whose behaviour can be verified against known physics.
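The core design described in the abstract, a fixed finite-difference stencil for the linear operator plus a learned network for the nonlinear term, summed into the right-hand side of an ODE and integrated by a standard solver, can be sketched as follows. Every concrete choice here is an illustrative assumption rather than the paper's implementation: the linear part is a 1D periodic diffusion stencil, the "neural operator" is a tiny untrained pointwise tanh network, and the solver is classical RK4.

```python
import numpy as np

def linear_operator(u, nu=0.1, dx=2 * np.pi / 64):
    """Fixed finite-difference approximation of the linear term
    (here: diffusion, nu * d^2u/dx^2) as a periodic 3-point stencil."""
    return nu * (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2

def nonlinear_operator(u, params):
    """Stand-in for a learned neural operator handling the nonlinear
    physics. A small pointwise MLP here; in the paper this component
    would be a trained neural operator."""
    w1, b1, w2, b2 = params
    h = np.tanh(u[:, None] * w1 + b1)  # (n, hidden)
    return h @ w2 + b2                 # (n,)

def rhs(u, params):
    # Operator-splitting-inspired RHS: fixed linear finite-difference
    # part plus learned nonlinear part, du/dt = L(u) + N_theta(u).
    return linear_operator(u) + nonlinear_operator(u, params)

def rk4_step(u, params, dt):
    # Any off-the-shelf ODE solver can integrate the learned RHS,
    # giving continuous-in-time predictions; classical RK4 shown.
    k1 = rhs(u, params)
    k2 = rhs(u + 0.5 * dt * k1, params)
    k3 = rhs(u + 0.5 * dt * k2, params)
    k4 = rhs(u + dt * k3, params)
    return u + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

# Demo: random (untrained) nonlinear-operator weights, short rollout.
rng = np.random.default_rng(0)
hidden = 8
params = (0.1 * rng.standard_normal(hidden), np.zeros(hidden),
          0.1 * rng.standard_normal(hidden), 0.0)
u = np.sin(np.linspace(0.0, 2 * np.pi, 64, endpoint=False))
for _ in range(10):
    u = rk4_step(u, params, dt=0.01)
```

Because the solver is decoupled from the learned components, rollouts are not tied to a fixed time step and can in principle be continued past the training horizon, which is the temporal-extrapolation property the abstract claims.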