ContactGaussian-WM: Learning Physics-Grounded World Model from Videos

📅 2026-02-11
🤖 AI Summary
Existing methods struggle to accurately model physical environments in data-sparse, contact-rich complex dynamic scenes. This work proposes a differentiable, physics-driven rigid-body world model that, for the first time, employs a unified Gaussian representation to jointly model visual appearance and collision geometry. Integrated with an end-to-end differentiable physics engine, the model enables learning complex physical dynamics directly from sparse video sequences. The approach supports inference of physical properties and demonstrates superior performance over existing methods in both simulated and real-world scenarios, exhibiting strong generalization capabilities. Furthermore, it has been successfully applied to synthetic data generation and real-time model-predictive control.

📝 Abstract
Developing world models that understand complex physical interactions is essential for advancing robotic planning and simulation. However, existing methods often struggle to accurately model the environment under conditions of data scarcity and complex contact-rich dynamic motion. To address these challenges, we propose ContactGaussian-WM, a differentiable physics-grounded rigid-body world model capable of learning intricate physical laws directly from sparse and contact-rich video sequences. Our framework consists of two core components: (1) a unified Gaussian representation for both visual appearance and collision geometry, and (2) an end-to-end differentiable learning framework that differentiates through a closed-form physics engine to infer physical properties from sparse visual observations. Extensive simulations and real-world evaluations demonstrate that ContactGaussian-WM outperforms state-of-the-art methods in learning complex scenarios, exhibiting robust generalization capabilities. Furthermore, we showcase the practical utility of our framework in downstream applications, including data synthesis and real-time MPC.
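To make the idea of differentiating through a closed-form physics step concrete, here is a minimal toy sketch. It is not the paper's model or engine: it assumes a 1-D block sliding under Coulomb friction, treats a handful of observed positions as the "sparse frames", and recovers the friction coefficient by gradient descent. The names `rollout` and `fit_friction`, the constants, and the forward-sensitivity scheme are all hypothetical choices made for illustration.

```python
# Illustrative toy only: a 1-D sliding block, NOT the paper's engine or model.
G = 9.81   # gravitational acceleration (m/s^2)
DT = 0.05  # time between sparse "frames" (s), an assumed value


def rollout(mu, v0=3.0, steps=10):
    """Simulate block positions and their derivatives with respect to mu."""
    x, v = 0.0, v0
    dx, dv = 0.0, 0.0  # forward-mode sensitivities d(x)/d(mu), d(v)/d(mu)
    xs, dxs = [], []
    for _ in range(steps):
        # Closed-form friction step; the block is assumed to keep sliding.
        v = v - mu * G * DT
        dv = dv - G * DT
        x = x + v * DT
        dx = dx + dv * DT
        xs.append(x)
        dxs.append(dx)
    return xs, dxs


def fit_friction(observed, mu=0.1, lr=0.1, iters=200):
    """Gradient descent on squared position error to estimate mu."""
    for _ in range(iters):
        xs, dxs = rollout(mu, steps=len(observed))
        # Chain rule: d(loss)/d(mu) = sum of 2 * residual * d(x)/d(mu).
        grad = sum(2.0 * (x - o) * d for x, o, d in zip(xs, observed, dxs))
        mu -= lr * grad
    return mu


if __name__ == "__main__":
    observed, _ = rollout(0.3)      # synthetic "observations" with mu = 0.3
    print(fit_friction(observed))   # estimate should approach 0.3
```

In this toy the simulated positions are linear in the friction coefficient, so the loss is a simple quadratic and plain gradient descent converges quickly. A contact-rich rigid-body setting is far less benign, which is the regime the abstract targets.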
Problem

Research questions and friction points this paper is trying to address.

world model
physics-grounded
contact-rich dynamics
data scarcity
rigid-body simulation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian representation
differentiable physics
world model
contact-rich dynamics
sparse video learning
Meizhong Wang
Department of Control Science and Engineering, College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China; Shanghai Institute of Intelligent Science and Technology, National Key Laboratory of Autonomous Intelligent Unmanned Systems, Beijing 100816, China; Frontiers Science Center for Intelligent Autonomous Systems, Ministry of Education, Beijing 100816, China
Wanxin Jin
Assistant Professor at Arizona State University
Robotics, Control, Optimization, Manipulation, Machine learning
Kun Cao
Tongji University
multi-robot systems, inverse RL, soft robotics
Lihua Xie
Professor of Electrical Engineering, Nanyang Technological University
Robust control, Networked Control, Multi-agent Systems
Yiguang Hong
Institute of Systems Science, Chinese Academy of Sciences
Multi-agent systems, distributed optimization/game, nonlinear dynamics and control, machine learning, automata