🤖 AI Summary
To address the scalability bottleneck in industrial-scale physical simulation—specifically, the challenge of simultaneously capturing fine-grained local interactions and long-range global dependencies among million-scale spatial elements—this paper proposes a multi-scale parallel attention mechanism. Our core innovation is a spatially adaptive partitioning scheme based on sphere trees, which jointly designs intra-block point-level local attention and inter-block global attention for efficient single-GPU parallelization. We further introduce a multi-scale Patch Transformer architecture that explicitly models cross-resolution correlations in physical fields. Evaluated on multiple PDE benchmarks and large-scale aerodynamic simulation datasets, our method achieves state-of-the-art accuracy while reducing memory consumption by 42% and accelerating training by 3.1×. Notably, it is the first approach to enable million-node neural PDE solving on a single GPU.
📝 Abstract
A key scalability challenge in neural solvers for industrial-scale physics simulations is efficiently capturing both fine-grained local interactions and long-range global dependencies across millions of spatial elements. We introduce the Multi-Scale Patch Transformer (MSPT), an architecture that combines local point attention within patches with global attention to coarse patch-level representations. To partition the input domain into spatially-coherent patches, we employ ball trees, which handle irregular geometries efficiently. This dual-scale design enables MSPT to scale to millions of points on a single GPU. We validate our method on standard PDE benchmarks (elasticity, plasticity, fluid dynamics, porous flow) and large-scale aerodynamic datasets (ShapeNet-Car, Ahmed-ML), achieving state-of-the-art accuracy with substantially lower memory footprint and computational cost.