Axial Neural Networks for Dimension-Free Foundation Models

📅 2025-10-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
In physics-informed modeling, solutions to partial differential equations (PDEs) naturally manifest as tensors of varying dimensionality, yet conventional foundation models require fixed-dimensional inputs or dimension-specific encoders, limiting both generalization and efficiency. To address this, we propose the Axial Neural Network (XNN), a dimension-agnostic architecture that borrows parameter-sharing mechanisms from structures such as Deep Sets and graph neural networks, effectively "axializing" existing PDE solvers. XNN enables zero-shot cross-dimensional inference and in-context learning without architectural modification, and it integrates seamlessly across the full pipeline, from pretraining to fine-tuning. Experiments show that XNN matches the performance of the original models on seen dimensions while achieving substantial generalization gains on unseen dimensions. These results underscore the importance of multidimensional joint pretraining for physics foundation models.

📝 Abstract
The advent of foundation models in AI has significantly advanced general-purpose learning, enabling remarkable capabilities in zero-shot inference and in-context learning. However, training such models on physics data, including solutions to partial differential equations (PDEs), poses a unique challenge due to varying dimensionalities across different systems. Traditional approaches either fix a maximum dimension or employ separate encoders for different dimensionalities, resulting in inefficiencies. To address this, we propose a dimension-agnostic neural network architecture, the Axial Neural Network (XNN), inspired by parameter-sharing structures such as Deep Sets and Graph Neural Networks. XNN generalizes across varying tensor dimensions while maintaining computational efficiency. We convert existing PDE foundation models into axial neural networks and evaluate their performance across three training scenarios: training from scratch, pretraining on multiple PDEs, and fine-tuning on a single PDE. Our experiments show that XNNs perform competitively with original models and exhibit superior generalization to unseen dimensions, highlighting the importance of multidimensional pretraining for foundation models.
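The core idea of the parameter-sharing can be sketched in a few lines: apply the same learned 1-D operator along every spatial axis of the input tensor, so the parameter count is independent of the number of axes. This is only an illustrative toy, not the paper's implementation; the layer shape, channel handling, and nonlinearities of the actual XNN are assumptions simplified away here.

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared 1-D mixing matrix; the same parameters are reused
# for every spatial axis, whatever the input dimensionality.
n = 8  # hypothetical resolution per axis, chosen for illustration
W = rng.standard_normal((n, n)) / np.sqrt(n)

def axial_layer(u, W):
    """Apply a shared 1-D operator along each axis of an N-D field.

    Because W does not depend on the number of axes, the same layer
    accepts 1-D, 2-D, or 3-D PDE solution tensors unchanged.
    """
    out = np.zeros_like(u)
    for axis in range(u.ndim):
        # Move the target axis last, contract it with W, move it back.
        out += np.moveaxis(np.moveaxis(u, axis, -1) @ W.T, -1, axis)
    return out / u.ndim  # average the per-axis contributions

# The identical parameters handle inputs of any dimensionality:
u1 = rng.standard_normal(n)          # 1-D solution sample
u2 = rng.standard_normal((n, n))     # 2-D
u3 = rng.standard_normal((n, n, n))  # 3-D
print(axial_layer(u1, W).shape)  # → (8,)
print(axial_layer(u2, W).shape)  # → (8, 8)
print(axial_layer(u3, W).shape)  # → (8, 8, 8)
```

A dimension-specific model would instead need a separate 2-D or 3-D kernel per input rank; here the output shape always matches the input shape while the weights stay fixed, which is what makes zero-shot inference on an unseen dimension structurally possible.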
Problem

Research questions and friction points this paper is trying to address.

Addressing varying dimensionalities in physics data for foundation models
Proposing dimension-agnostic architecture to generalize across tensor dimensions
Enhancing generalization to unseen dimensions through multidimensional pretraining
Innovation

Methods, ideas, or system contributions that make the work stand out.

Axial Neural Network enables dimension-agnostic architecture
Parameter-sharing generalizes across varying tensor dimensions
Multidimensional pretraining enhances generalization to unseen dimensions