MSPT: Efficient Large-Scale Physical Modeling via Parallelized Multi-Scale Attention

📅 2025-12-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the scalability bottleneck in industrial-scale physical simulation—specifically, the challenge of simultaneously capturing fine-grained local interactions and long-range global dependencies among million-scale spatial elements—this paper proposes a multi-scale parallel attention mechanism. Our core innovation is a spatially adaptive partitioning scheme based on sphere trees, which jointly designs intra-block point-level local attention and inter-block global attention for efficient single-GPU parallelization. We further introduce a multi-scale Patch Transformer architecture that explicitly models cross-resolution correlations in physical fields. Evaluated on multiple PDE benchmarks and large-scale aerodynamic simulation datasets, our method achieves state-of-the-art accuracy while reducing memory consumption by 42% and accelerating training by 3.1×. Notably, it is the first approach to enable million-node neural PDE solving on a single GPU.

Technology Category

Application Category

📝 Abstract
A key scalability challenge in neural solvers for industrial-scale physics simulations is efficiently capturing both fine-grained local interactions and long-range global dependencies across millions of spatial elements. We introduce the Multi-Scale Patch Transformer (MSPT), an architecture that combines local point attention within patches with global attention to coarse patch-level representations. To partition the input domain into spatially-coherent patches, we employ ball trees, which handle irregular geometries efficiently. This dual-scale design enables MSPT to scale to millions of points on a single GPU. We validate our method on standard PDE benchmarks (elasticity, plasticity, fluid dynamics, porous flow) and large-scale aerodynamic datasets (ShapeNet-Car, Ahmed-ML), achieving state-of-the-art accuracy with substantially lower memory footprint and computational cost.
Problem

Research questions and friction points this paper is trying to address.

Efficiently capturing local and global dependencies in large-scale physics simulations
Scaling neural solvers to millions of spatial elements on limited hardware
Reducing memory and computational costs while maintaining simulation accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-scale patch attention for local-global interactions
Ball tree partitioning for irregular geometry handling
Single GPU scalability to millions of points
🔎 Similar Papers
No similar papers found.
P
Pedro M. P. Curvo
University of Amsterdam
Jan-Willem van de Meent
Jan-Willem van de Meent
University of Amsterdam
Probabilistic ProgrammingMachine LearningArtificial IntelligenceData Science
M
Maksim Zhdanov
University of Amsterdam