MotionPhysics: Learnable Motion Distillation for Text-Guided Simulation

📅 2026-01-01
🏛️ arXiv.org
📈 Citations: 0 · Influential: 0
🤖 AI Summary
This work proposes the first end-to-end differentiable framework capable of generating physically plausible dynamics in 3D scenes directly from text prompts, circumventing the need for the expert knowledge and laborious parameter tuning characteristic of traditional physics simulation. By leveraging a video diffusion model to extract motion priors and introducing a learnable motion distillation loss, the method decouples motion cues from appearance and geometry biases. Notably, it achieves purely text-driven generation of physically consistent dynamics without requiring ground-truth trajectories or annotated videos. Evaluated across more than 30 diverse scenarios spanning elastic solids, metals, foams, granular materials, and both Newtonian and non-Newtonian fluids, the approach significantly outperforms existing methods in producing realistic, physically coherent motion.
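
The summary describes a distillation loss that uses a pretrained video diffusion model as a motion critic for a differentiable simulator. The paper's exact loss is not reproduced on this page; the sketch below only illustrates the standard score-distillation gradient shape such a loss typically builds on. The `denoiser` stub, the function names, and the toy linear noise schedule are all placeholder assumptions, not the authors' API.

```python
# Minimal sketch (PyTorch), NOT the authors' implementation: a
# score-distillation-style loss in which a frozen video diffusion
# model scores noised renderings of the simulated scene.
import torch

def denoiser(noisy_video, t, text_emb):
    # Placeholder for a frozen pretrained video diffusion model that
    # predicts the noise added at timestep t; a real model goes here.
    return torch.zeros_like(noisy_video)

def sds_style_loss(rendered, text_emb):
    # rendered: (T, C, H, W) frames produced differentiably by the simulator.
    t = torch.randint(1, 1000, ())              # random diffusion timestep
    alpha = 1.0 - t.float() / 1000.0            # toy linear noise schedule
    noise = torch.randn_like(rendered)
    noisy = alpha.sqrt() * rendered + (1.0 - alpha).sqrt() * noise
    eps_hat = denoiser(noisy, t, text_emb)
    # Score-distillation gradient: the residual (eps_hat - noise) is
    # detached from the diffusion model, so gradients flow back only
    # through the renderer and simulator.
    grad = (eps_hat - noise).detach()
    return (grad * rendered).sum()

# Usage: gradients flow into whatever parameters produced `frames`.
frames = torch.rand(8, 3, 64, 64, requires_grad=True)
sds_style_loss(frames, text_emb=torch.zeros(77, 768)).backward()
```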

📝 Abstract
Accurately simulating existing 3D objects and a wide variety of materials often demands expert knowledge and time-consuming physical parameter tuning to achieve the desired dynamic behavior. We introduce MotionPhysics, an end-to-end differentiable framework that infers plausible physical parameters from a user-provided natural language prompt for a chosen 3D scene of interest, removing the need for guidance from ground-truth trajectories or annotated videos. Our approach first utilizes a multimodal large language model to estimate material parameter values, which are constrained to lie within plausible ranges. We further propose a learnable motion distillation loss that extracts robust motion priors from pretrained video diffusion models while minimizing appearance and geometry inductive biases to guide the simulation. We evaluate MotionPhysics across more than thirty scenarios, including real-world, human-designed, and AI-generated 3D objects, spanning a wide range of materials such as elastic solids, metals, foams, sand, and both Newtonian and non-Newtonian fluids. We demonstrate that MotionPhysics produces visually realistic dynamic simulations guided by natural language, surpassing the state of the art while automatically determining physically plausible parameters. The code and project page are available at: https://wangmiaowei.github.io/MotionPhysics.github.io/.
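
The abstract's note that LLM-estimated material parameters are "constrained to lie within plausible ranges" suggests a bounded, differentiable parameterization. Below is a minimal sketch of one common way to achieve this, a sigmoid reparameterization; `BoundedParam` and the example Young's modulus range are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (PyTorch), NOT the authors' code: keep a learnable
# physical parameter inside (lo, hi) by optimizing an unconstrained
# logit and mapping it through a sigmoid.
import torch

class BoundedParam(torch.nn.Module):
    """Hypothetical helper: a scalar constrained to the interval (lo, hi)."""
    def __init__(self, init, lo, hi):
        super().__init__()
        assert lo < init < hi
        # Invert the sigmoid so the constrained value starts at `init`.
        frac = (init - lo) / (hi - lo)
        self.raw = torch.nn.Parameter(torch.logit(torch.tensor(frac)))
        self.lo, self.hi = lo, hi

    def forward(self):
        return self.lo + (self.hi - self.lo) * torch.sigmoid(self.raw)

# e.g. a Young's modulus with an LLM-suggested plausible range
# (values here are illustrative, not from the paper).
youngs_modulus = BoundedParam(init=1e5, lo=1e4, hi=1e6)
print(youngs_modulus())  # differentiable, always strictly inside (1e4, 1e6)
```

For parameters spanning several orders of magnitude, applying the same reparameterization in log space is often more stable.
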
Problem

Research questions and friction points this paper is trying to address.

text-guided simulation
physical parameter estimation
3D object dynamics
natural language prompting
motion simulation
Innovation

Methods, ideas, or system contributions that make the work stand out.

learnable motion distillation
text-guided simulation
differentiable physics
multimodal LLM
video diffusion priors
Miaowei Wang
School of Informatics, The University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
Jakub Zadrożny
School of Informatics, The University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
Oisin Mac Aodha
Reader (Associate Professor), University of Edinburgh
Computer Vision · Machine Learning · Machine Teaching · Active Learning · Conservation Technology
Amir Vaxman
The University of Edinburgh
Digital Geometry Processing · Shape Science · Computer Graphics