Learning Deformable Body Interactions With Adaptive Spatial Tokenization

📅 2025-07-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Graph Neural Networks (GNNs) for deformable-body interaction simulation suffer from high computational cost and scale poorly to large meshes because pairwise global edges must be constructed dynamically. Method: We propose Adaptive Spatial Tokenization (AST), a framework that maps unstructured meshes onto a structured grid and generates compact state tokens; integrates cross-attention and self-attention to efficiently model physical evolution in latent space; and employs dynamic neighborhood aggregation for end-to-end learned simulation. Contribution/Results: The method achieves high accuracy and efficiency on meshes exceeding 100,000 nodes and significantly outperforms existing state-of-the-art methods. We also release a new large-scale dataset covering diverse deformable-body interaction scenarios, enabling systematic evaluation and advancement of learning-based physics simulation.

📝 Abstract

Simulating interactions between deformable bodies is vital in fields like material science, mechanical design, and robotics. While learning-based methods with Graph Neural Networks (GNNs) are effective at solving complex physical systems, they encounter scalability issues when modeling deformable body interactions. To model interactions between objects, pairwise global edges have to be created dynamically, which is computationally intensive and impractical for large-scale meshes. To overcome these challenges, drawing on insights from geometric representations, we propose an Adaptive Spatial Tokenization (AST) method for efficient representation of physical states. By dividing the simulation space into a grid of cells and mapping unstructured meshes onto this structured grid, our approach naturally groups adjacent mesh nodes. We then apply a cross-attention module to map the sparse cells into a compact, fixed-length embedding, serving as tokens for the entire physical state. Self-attention modules are employed to predict the next state over these tokens in latent space. This framework leverages the efficiency of tokenization and the expressive power of attention mechanisms to achieve accurate and scalable simulation results. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art approaches in modeling deformable body interactions. Notably, it remains effective on large-scale simulations with meshes exceeding 100,000 nodes, where existing methods are hindered by computational limitations. Additionally, we contribute a novel large-scale dataset encompassing a wide range of deformable body interactions to support future research in this area.
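The grid-mapping step described in the abstract can be illustrated with a minimal sketch. It assumes a uniform grid with a single `cell_size` parameter and integer cell keys; the function name `group_nodes_by_cell` and the sample coordinates are hypothetical, and the paper's adaptive scheme may differ from this fixed-resolution version.

```python
from collections import defaultdict
from math import floor


def group_nodes_by_cell(positions, cell_size):
    """Group mesh node indices by the uniform grid cell that contains them.

    Assumption: a fixed-resolution grid; the adaptive part of AST is not modeled here.
    """
    cells = defaultdict(list)
    for node_id, (x, y, z) in enumerate(positions):
        # Integer cell coordinates obtained by flooring the scaled position.
        key = (floor(x / cell_size), floor(y / cell_size), floor(z / cell_size))
        cells[key].append(node_id)
    return dict(cells)


# Hypothetical node positions: two pairs of nearby nodes and one distant node.
positions = [
    (0.1, 0.2, 0.0),
    (0.3, 0.1, 0.2),
    (1.2, 0.1, 0.0),
    (1.4, 0.3, 0.1),
    (5.0, 5.0, 5.0),
]
cells = group_nodes_by_cell(positions, cell_size=1.0)
print(cells)  # {(0, 0, 0): [0, 1], (1, 0, 0): [2, 3], (5, 5, 5): [4]}
```

Only occupied cells appear in the result, which is why the abstract can speak of "sparse cells": empty regions of the simulation space cost nothing to store.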

Problem

Research questions and friction points this paper is trying to address.

Scalability issues in modeling deformable body interactions

Computationally intensive dynamic creation of pairwise global edges

Inefficient representation of large-scale mesh interactions
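To make the cost of pairwise edge creation concrete, the sketch below builds contact edges between two bodies by brute force and counts the distance checks. The function name, the `radius` threshold, and the sample points are assumptions for illustration; the source does not say how existing GNN methods search for neighbors, and practical systems may use spatial acceleration structures instead.

```python
from math import dist


def brute_force_contact_edges(body_a, body_b, radius):
    """Return (edges, checks) for a naive all-pairs proximity search.

    Assumption: an edge is created when two nodes from different bodies
    lie within `radius` of each other.
    """
    edges = []
    checks = 0
    for i, p in enumerate(body_a):
        for j, q in enumerate(body_b):
            checks += 1  # one distance evaluation per node pair
            if dist(p, q) <= radius:
                edges.append((i, j))
    return edges, checks


body_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
body_b = [(0.0, 0.0, 0.05), (3.0, 0.0, 0.0)]
edges, checks = brute_force_contact_edges(body_a, body_b, radius=0.1)
print(edges, checks)  # [(0, 0)] 4
```

The number of checks is the product of the two node counts, and the search must be repeated at every time step because the bodies move. For two bodies of 50,000 nodes each, that is 2.5 billion checks per step under this naive scheme.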

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Spatial Tokenization for deformable bodies

Grid-based cross-attention for compact embeddings

Self-attention in latent space for scalability
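The idea of compressing a variable number of occupied cells into a fixed number of tokens can be sketched with plain scaled dot-product cross-attention. This is a simplified illustration, not the paper's architecture: the query vectors stand in for learned parameters, cell features serve as both keys and values with no learned projections, and all names and numbers are hypothetical.

```python
from math import exp, sqrt


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, cell_features):
    """Map any number of cell feature vectors to len(queries) tokens.

    Assumption: keys and values are the raw cell features (no projections).
    """
    dim = len(queries[0])
    tokens = []
    for q in queries:
        # Scaled dot-product score between this query and every cell.
        scores = [sum(a * b for a, b in zip(q, c)) / sqrt(dim) for c in cell_features]
        weights = softmax(scores)
        # Each token is a weighted average of the cell features.
        tokens.append(
            [sum(w * c[k] for w, c in zip(weights, cell_features)) for k in range(dim)]
        )
    return tokens


queries = [[1.0, 0.0], [0.0, 1.0]]  # stand-ins for learned query vectors
few_cells = [[0.5, 0.1], [0.2, 0.9], [0.4, 0.4]]
many_cells = few_cells + [[0.7, 0.3], [0.1, 0.6]]

tokens_few = cross_attend(queries, few_cells)
tokens_many = cross_attend(queries, many_cells)
print(len(tokens_few), len(tokens_many))  # 2 2
```

The output length depends only on the number of queries, so later self-attention layers operate on a sequence of constant size regardless of how many cells the mesh occupies.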

🔎 Similar Papers

Shape-Space Deformer: Unified Visuo-Tactile Representations for Robotic Manipulation of Deformable Objects