🤖 AI Summary
This work addresses the challenge of jointly modeling solution operators across multiple classes of partial differential equations (PDEs), which are complex and structurally heterogeneous. To this end, the authors propose an input-dependent adaptive operator transformation mechanism that maps heterogeneous solution operators into a unified, aligned, and simplified form. This is achieved by expanding hidden representations into parallel streams, adaptively aggregating and redistributing features, and mixing streams through doubly stochastic matrices generated by Sinkhorn projection. The method adds only 3% more parameters yet achieves state-of-the-art performance across 12 PDE benchmarks, reducing relative L2 error by 40.9% on average. After fine-tuning, it further reduces error by up to 92% on in-domain PDEs and up to 89% on out-of-domain PDEs.
📝 Abstract
Pre-training neural operators on diverse partial differential equation (PDE) datasets has emerged as a promising direction for building general-purpose surrogate models in scientific machine learning. However, the inherent complexity and structural diversity of PDE solution operators make multi-PDE pre-training fundamentally challenging. Existing methods mainly address this by increasing model capacity, while leaving the target solution operators unchanged. Inspired by classical numerical analysis, we instead propose to transform complex and diverse solution operators into simpler, better-aligned forms that are easier to model jointly. Since the optimal transformation varies across PDE types, it must be adaptive and input-dependent, allowing a single neural operator to approximate an entire family of operators. We instantiate this idea as AOT-POT (adaptive operator-transformation for pre-training operator transformer), which expands hidden representations into multiple parallel streams, adaptively aggregates and redistributes them before and after each sub-layer, and mixes streams through Sinkhorn-projected doubly stochastic matrices for stable training. Together, these mechanisms reshape diverse solution operators into a unified form that can be effectively modeled by a single architecture. Empirically, AOT-POT achieves state-of-the-art performance on 12 PDE benchmarks with only 3% additional parameters, reducing relative L2 error by up to 77.6% (40.9% on average). Fine-tuning AOT-POT further reduces L2 error by up to 92% on in-domain PDEs and 89% on out-of-domain PDEs (types unseen during pre-training), demonstrating that adaptive operator transformation is an effective and complementary direction for advancing PDE foundation models beyond simply scaling model capacity.
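
The stream-mixing step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `sinkhorn_project` and `mix_streams`, the iteration count `n_iters`, and the shapes (4 streams of width 8) are assumptions chosen only for illustration. The sketch uses the standard Sinkhorn iteration, which alternately normalizes the rows and columns of a positive matrix until it is approximately doubly stochastic.

```python
import numpy as np


def sinkhorn_project(logits, n_iters=50):
    """Map an (n, n) matrix of real scores to an approximately
    doubly stochastic matrix via alternating normalization.
    (Hypothetical helper; name and defaults are assumptions.)"""
    # Shift by the max before exponentiating for numerical stability;
    # the result has strictly positive entries.
    m = np.exp(logits - logits.max())
    for _ in range(n_iters):
        m = m / m.sum(axis=1, keepdims=True)  # make each row sum to 1
        m = m / m.sum(axis=0, keepdims=True)  # make each column sum to 1
    return m


def mix_streams(streams, logits, n_iters=50):
    """Mix parallel streams of shape (n_streams, d) with a
    Sinkhorn-projected matrix; the output has the same shape."""
    return sinkhorn_project(logits, n_iters) @ streams


# Illustrative data: 4 parallel streams, each a vector of width 8.
rng = np.random.default_rng(0)
streams = rng.normal(size=(4, 8))
logits = rng.normal(size=(4, 4))  # in a model these would be input-dependent

P = sinkhorn_project(logits)
mixed = mix_streams(streams, logits)

print("column sums:", P.sum(axis=0))
print("row sums:   ", P.sum(axis=1))
```

Because the columns of the mixing matrix sum to one, the sum of the streams is unchanged by mixing, and because the matrix is (approximately) doubly stochastic, its spectral norm is at most about one, so mixing does not amplify the streams. This is one plausible reading of why such a constraint would support stable training; the abstract itself does not spell out the argument.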