🤖 AI Summary
In physics-informed modeling, solutions to partial differential equations (PDEs) are naturally represented as tensors whose dimensionality varies from system to system. Conventional foundation models, however, either fix a maximum input dimension or use a separate encoder for each dimensionality, which is inefficient. To address this, we propose the Axial Neural Network (XNN), a dimension-agnostic architecture inspired by parameter-sharing structures such as Deep Sets and graph neural networks, and use it to “axialize” existing PDE foundation models. The same network handles tensors of different dimensions without architectural modification, and it is evaluated in three settings: training from scratch, pretraining on multiple PDEs, and fine-tuning on a single PDE. Experiments show that XNNs perform competitively with the original models on seen dimensions and generalize better to unseen dimensions. These results highlight the importance of multidimensional pretraining for physics foundation models.
📝 Abstract
The advent of foundation models in AI has significantly advanced general-purpose learning, enabling remarkable capabilities in zero-shot inference and in-context learning. However, training such models on physics data, including solutions to partial differential equations (PDEs), poses a unique challenge due to varying dimensionalities across different systems. Traditional approaches either fix a maximum dimension or employ separate encoders for different dimensionalities, resulting in inefficiencies. To address this, we propose a dimension-agnostic neural network architecture, the Axial Neural Network (XNN), inspired by parameter-sharing structures such as Deep Sets and Graph Neural Networks. XNN generalizes across varying tensor dimensions while maintaining computational efficiency. We convert existing PDE foundation models into axial neural networks and evaluate their performance across three training scenarios: training from scratch, pretraining on multiple PDEs, and fine-tuning on a single PDE. Our experiments show that XNNs perform competitively with original models and exhibit superior generalization to unseen dimensions, highlighting the importance of multidimensional pretraining for foundation models.
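To make the idea of dimension-agnostic parameter sharing concrete, the following is a minimal sketch and not the XNN architecture itself, whose details the abstract does not give. It assumes one plausible reading of "axial" sharing: a single 1D kernel is applied along every axis of the input and the per-axis results are averaged, in the spirit of Deep Sets pooling. The function name `axial_conv`, the periodic boundary handling, and the averaging step are all illustrative assumptions.

```python
import numpy as np


def axial_conv(u, kernel):
    """Apply one shared 1D kernel along every axis of `u` and average.

    Hypothetical illustration only: the same `kernel` weights are reused
    for every axis, so the parameter count does not depend on `u.ndim`.
    Boundaries are treated as periodic (an assumption made for brevity).
    """
    half = len(kernel) // 2
    out = np.zeros_like(u, dtype=float)
    for axis in range(u.ndim):
        # 1D stencil along `axis`, implemented with periodic shifts.
        for k, w in enumerate(kernel):
            out += w * np.roll(u, shift=half - k, axis=axis)
    # Averaging over axes is symmetric in the axes, so permuting the
    # input axes simply permutes the output axes.
    return out / u.ndim


# The same three weights process 1D, 2D, and 3D fields.
kernel = np.array([1.0, -2.0, 1.0])
rng = np.random.default_rng(0)
for shape in [(8,), (8, 6), (4, 5, 6)]:
    field = rng.standard_normal(shape)
    print(shape, "->", axial_conv(field, kernel).shape)
```

Because the per-axis outputs are pooled symmetrically, transposing the input transposes the output. This is the kind of permutation structure that Deep Sets-style sharing is built around, and it is what lets one set of weights be reused on tensors of a dimension not seen during training.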