PhysWorld: From Real Videos to World Models of Deformable Objects via Physics-Aware Demonstration Synthesis

📅 2025-10-24
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Learning the dynamics of deformable objects from scarce real-world video data remains challenging: models trained on so little data tend to generalize poorly and lose physical fidelity. Method: We propose a physics-aware, digital-twin-driven world model framework. Within an MPM simulator, we build a digital twin through constitutive model selection and global-to-local optimization of material properties, then apply part-aware perturbations to those properties to generate diverse, physically plausible synthetic data. We further design a lightweight graph neural network (GNN) world model that explicitly embeds physical attributes and supports refinement of those attributes from real video. Contribution/Results: With only minimal real-video supervision, our approach delivers accurate future prediction across diverse deformable objects, runs inference 47× faster than PhysTwin, and generalizes well to novel interactions.

๐Ÿ“ Abstract
Interactive world models that simulate object dynamics are crucial for robotics, VR, and AR. However, it remains a significant challenge to learn physics-consistent dynamics models from limited real-world video data, especially for deformable objects with spatially-varying physical properties. To overcome the challenge of data scarcity, we propose PhysWorld, a novel framework that utilizes a simulator to synthesize physically plausible and diverse demonstrations to learn efficient world models. Specifically, we first construct a physics-consistent digital twin within an MPM simulator via constitutive model selection and global-to-local optimization of physical properties. Subsequently, we apply part-aware perturbations to the physical properties and generate various motion patterns for the digital twin, synthesizing extensive and diverse demonstrations. Finally, using these demonstrations, we train a lightweight GNN-based world model that is embedded with physical properties. The real video can be used to further refine the physical properties. PhysWorld achieves accurate and fast future predictions for various deformable objects, and also generalizes well to novel interactions. Experiments show that PhysWorld has competitive performance while enabling inference speeds 47 times faster than the recent state-of-the-art method, i.e., PhysTwin.
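The part-aware perturbation step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `perturb_by_part`, the `max_rel_change` parameter, and the uniform multiplicative sampling are all assumptions chosen for illustration. The only idea taken from the abstract is that physical properties are perturbed per part, so particles belonging to the same part stay mutually consistent.

```python
import random


def perturb_by_part(stiffness, part_ids, max_rel_change=0.2, seed=0):
    """Scale each particle's stiffness by a random factor shared within its part.

    Hypothetical helper: the sampling rule is an assumption, not the paper's.
    """
    rng = random.Random(seed)
    # Draw one multiplicative scale per part from
    # [1 - max_rel_change, 1 + max_rel_change].
    scales = {}
    for pid in sorted(set(part_ids)):
        scales[pid] = 1.0 + rng.uniform(-max_rel_change, max_rel_change)
    # Every particle inherits the scale of the part it belongs to.
    return [s * scales[p] for s, p in zip(stiffness, part_ids)]


# Four particles split into two parts with different base stiffness.
stiffness = [1.0e4, 1.0e4, 2.0e4, 2.0e4]
part_ids = [0, 0, 1, 1]
perturbed = perturb_by_part(stiffness, part_ids)
```

Calling the function with different seeds would yield different property variants of the same object, each of which could then be simulated to produce a distinct demonstration.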
Problem

Research questions and friction points this paper is trying to address.

Learning physics-consistent dynamics models from limited video data
Handling deformable objects with spatially-varying physical properties
Overcoming data scarcity for efficient world model training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses simulator to synthesize physics-aware demonstration data
Constructs digital twin via constitutive model optimization
Trains lightweight GNN model with embedded physical properties
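To make "GNN model with embedded physical properties" more concrete, here is a minimal sketch of one message-passing step over a particle graph. It is an illustration under stated assumptions, not the PhysWorld architecture: the names `build_edges` and `message_passing_step` are hypothetical, and a fixed spring-like rule stands in for the learned message function. The point it shows is that a per-particle physical property (here, `stiffness`) enters every message alongside the geometry.

```python
import math


def build_edges(positions, radius):
    """Connect every ordered pair of distinct particles closer than `radius`."""
    edges = []
    for i, p in enumerate(positions):
        for j, q in enumerate(positions):
            if i != j and math.dist(p, q) < radius:
                edges.append((i, j))
    return edges


def message_passing_step(positions, stiffness, edges, rest_length=1.0, dt=0.01):
    """Aggregate neighbour messages per node, then update node positions.

    The message is a hand-written spring-like rule (an assumption standing in
    for a learned network); stiffness is the embedded physical property.
    """
    agg = [[0.0, 0.0] for _ in positions]
    for i, j in edges:  # message sent from node j to node i
        dx = positions[j][0] - positions[i][0]
        dy = positions[j][1] - positions[i][1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # coincident particles: direction undefined, skip
        k = 0.5 * (stiffness[i] + stiffness[j])
        mag = k * (dist - rest_length) / dist
        agg[i][0] += mag * dx
        agg[i][1] += mag * dy
    return [(p[0] + dt * a[0], p[1] + dt * a[1]) for p, a in zip(positions, agg)]


# Two particles stretched beyond the rest length are pulled toward each other.
positions = [(0.0, 0.0), (1.5, 0.0)]
stiffness = [2.0, 2.0]
edges = build_edges(positions, radius=2.0)
next_positions = message_passing_step(positions, stiffness, edges)
```

In a trained model the spring rule would be replaced by learned functions, and the physical properties would remain adjustable inputs, which is what would allow them to be refined from real video.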