Axial Neural Networks for Dimension-Free Foundation Models

📅 2025-10-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
In physics-informed modeling, solutions to partial differential equations (PDEs) naturally manifest as tensors of varying dimensionality, yet conventional foundation models require fixed-dimensional inputs or dimension-specific encoders, limiting both generalization and efficiency. To address this, we propose the Axial Neural Network (XNN), a dimension-agnostic architecture that borrows parameter-sharing mechanisms from structures such as Deep Sets and graph neural networks, effectively "axializing" existing PDE solvers. XNN enables zero-shot cross-dimensional inference and in-context learning without architectural modification, and it integrates seamlessly across the full pipeline, from pretraining to fine-tuning. Experiments show that XNN matches the performance of the original models on seen dimensions while achieving substantial generalization gains on unseen dimensions. These results underscore the importance of multidimensional joint pretraining for physics foundation models.

📝 Abstract
The advent of foundation models in AI has significantly advanced general-purpose learning, enabling remarkable capabilities in zero-shot inference and in-context learning. However, training such models on physics data, including solutions to partial differential equations (PDEs), poses a unique challenge due to varying dimensionalities across different systems. Traditional approaches either fix a maximum dimension or employ separate encoders for different dimensionalities, resulting in inefficiencies. To address this, we propose a dimension-agnostic neural network architecture, the Axial Neural Network (XNN), inspired by parameter-sharing structures such as Deep Sets and Graph Neural Networks. XNN generalizes across varying tensor dimensions while maintaining computational efficiency. We convert existing PDE foundation models into axial neural networks and evaluate their performance across three training scenarios: training from scratch, pretraining on multiple PDEs, and fine-tuning on a single PDE. Our experiments show that XNNs perform competitively with original models and exhibit superior generalization to unseen dimensions, highlighting the importance of multidimensional pretraining for foundation models.
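The core idea of the parameter-sharing can be sketched in a few lines: apply the same learned 1-D operator along every spatial axis of the input tensor, so the parameter count is independent of the number of axes. This is only an illustrative toy, not the paper's implementation; the layer shape, channel handling, and nonlinearities of the actual XNN are assumptions simplified away here.

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared 1-D mixing matrix; the same parameters are reused
# for every spatial axis, whatever the input dimensionality.
n = 8  # hypothetical resolution per axis, chosen for illustration
W = rng.standard_normal((n, n)) / np.sqrt(n)

def axial_layer(u, W):
    """Apply a shared 1-D operator along each axis of an N-D field.

    Because W does not depend on the number of axes, the same layer
    accepts 1-D, 2-D, or 3-D PDE solution tensors unchanged.
    """
    out = np.zeros_like(u)
    for axis in range(u.ndim):
        # Move the target axis last, contract it with W, move it back.
        out += np.moveaxis(np.moveaxis(u, axis, -1) @ W.T, -1, axis)
    return out / u.ndim  # average the per-axis contributions

# The identical parameters handle inputs of any dimensionality:
u1 = rng.standard_normal(n)          # 1-D solution sample
u2 = rng.standard_normal((n, n))     # 2-D
u3 = rng.standard_normal((n, n, n))  # 3-D
print(axial_layer(u1, W).shape)  # → (8,)
print(axial_layer(u2, W).shape)  # → (8, 8)
print(axial_layer(u3, W).shape)  # → (8, 8, 8)
```

A dimension-specific model would instead need a separate 2-D or 3-D kernel per input rank; here the output shape always matches the input shape while the weights stay fixed, which is what makes zero-shot inference on an unseen dimension structurally possible.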
Problem

Research questions and friction points this paper is trying to address.

Addressing varying dimensionalities in physics data for foundation models
Proposing dimension-agnostic architecture to generalize across tensor dimensions
Enhancing generalization to unseen dimensions through multidimensional pretraining
Innovation

Methods, ideas, or system contributions that make the work stand out.

Axial Neural Network enables dimension-agnostic architecture
Parameter-sharing generalizes across varying tensor dimensions
Multidimensional pretraining enhances generalization to unseen dimensions