TwinTrack: Bridging Vision and Contact Physics for Real-Time Tracking of Unknown Dynamic Objects

📅 2025-05-28
🤖 AI Summary
Problem: Real-time 6-DoF pose tracking of unknown dynamic objects in frequent-contact scenarios, such as dexterous in-hand manipulation, is challenged by severe occlusion, motion blur, and transient impact forces.
Method: We propose a bidirectional Real2Sim/Sim2Real framework that jointly integrates visual observations with differentiable contact-aware physical modeling. First, an initial visual reconstruction is refined online via contact dynamics optimization to estimate geometry and physical properties. Subsequently, the learned contact physics model adaptively fuses visual tracking outputs, enhancing physical consistency and robustness.
Contributions/Results: The method incorporates a GPU-accelerated physics engine, a multimodal adaptive fusion architecture, online collision geometry updating, and an end-to-end differentiable visual feature extraction network. Evaluated on drop-impact and dexterous manipulation tasks, it achieves >20 Hz real-time tracking, outperforming pure vision-based and conventional filtering approaches in both accuracy and robustness.

📝 Abstract
Real-time tracking of previously unseen, highly dynamic objects in contact-rich environments, such as during dexterous in-hand manipulation, remains a significant challenge. Purely vision-based tracking often suffers from heavy occlusions due to frequent contact interactions and from motion blur caused by abrupt motion during contact impacts. We propose TwinTrack, a physics-aware visual tracking framework that enables robust, real-time 6-DoF pose tracking of unknown dynamic objects in a contact-rich scene by leveraging the contact physics of the observed scene. At the core of TwinTrack is an integration of Real2Sim and Sim2Real. In Real2Sim, we combine the complementary strengths of vision and contact physics to estimate the object's collision geometry and physical properties: the object's geometry is first reconstructed from vision, then updated along with other physical parameters from contact dynamics for physical accuracy. In Sim2Real, robust pose estimation of the object is achieved by adaptive fusion between visual tracking and the prediction of the learned contact physics. TwinTrack is built on a GPU-accelerated, deeply customized physics engine to ensure real-time performance. We evaluate our method on two contact-rich scenarios: an object falling with rich contact impacts against the environment, and contact-rich in-hand manipulation. Experimental results demonstrate that, compared to baseline methods, TwinTrack achieves significantly more robust and accurate 6-DoF tracking in these challenging scenarios while running in real time, with tracking speed exceeding 20 Hz. Project page: https://irislab.tech/TwinTrack-webpage/
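
The Sim2Real step in the abstract fuses a visual pose estimate with a physics prediction. As a rough illustration only, the sketch below blends two 6-DoF poses using a scalar weight: positions are interpolated linearly and orientations spherically. The names `slerp` and `fuse_pose`, the scalar `vision_confidence`, and the blending rule itself are assumptions made for this example; the abstract does not say how TwinTrack computes its adaptive fusion.

```python
import math

def slerp(q0, q1, t):
    """Spherical interpolation between unit quaternions given as (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q and -q encode the same rotation; flip to follow the shorter arc.
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:
        # Nearly identical rotations: linear blend avoids dividing by ~0.
        q = tuple(a + t * (b - a) for a, b in zip(q0, q1))
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
        s1 = math.sin(t * theta) / math.sin(theta)
        q = tuple(s0 * a + s1 * b for a, b in zip(q0, q1))
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)

def fuse_pose(vision_pose, physics_pose, vision_confidence):
    """Blend two (position, quaternion) poses.

    vision_confidence is a hypothetical weight in [0, 1]: 0 trusts the
    physics prediction entirely, 1 trusts the visual tracker entirely.
    """
    w = min(1.0, max(0.0, vision_confidence))
    p_vis, q_vis = vision_pose
    p_phys, q_phys = physics_pose
    position = tuple((1.0 - w) * a + w * b for a, b in zip(p_phys, p_vis))
    rotation = slerp(q_phys, q_vis, w)
    return position, rotation

# Example: the tracker is half-trusted (say, the object is partly occluded).
half = math.sqrt(0.5)
vision = ((1.0, 0.0, 0.0), (half, 0.0, 0.0, half))   # 90 degrees about z
physics = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0))    # identity rotation
fused_position, fused_rotation = fuse_pose(vision, physics, 0.5)
```

In a setting like the one the abstract describes, the weight would presumably drop when occlusion or motion blur degrades the visual estimate, so the physics prediction carries the track through contact events.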
Problem

Research questions and friction points this paper is trying to address.

Real-time tracking of unseen dynamic objects in contact-rich environments
Overcoming occlusion and motion blur in vision-based tracking during contacts
Integrating vision and physics for accurate 6-DoF pose estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines vision and contact physics for tracking
Integrates Real2Sim and Sim2Real for accuracy
Uses GPU-accelerated physics engine for speed
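
To give a feel for what estimating a physical parameter from contact dynamics can look like, here is a deliberately tiny sketch: a least-squares fit of a single restitution coefficient from normal velocities observed just before and just after impacts. The function name `fit_restitution`, the one-parameter impact model `v_post = -e * v_pre`, and the sample numbers are all assumptions chosen for illustration. TwinTrack itself updates collision geometry and physical parameters through its physics engine, which this toy does not attempt to reproduce.

```python
def fit_restitution(v_pre, v_post):
    """Least-squares estimate of e in the model v_post = -e * v_pre.

    v_pre and v_post are normal velocities just before and just after each
    impact (negative means moving into the surface). Minimizing
    sum((v_post + e * v_pre)**2) over e gives the closed form below.
    """
    numerator = -sum(a * b for a, b in zip(v_pre, v_post))
    denominator = sum(a * a for a in v_pre)
    if denominator == 0.0:
        raise ValueError("need at least one impact with nonzero approach speed")
    return numerator / denominator

# Made-up observations from two impacts of a dropped object.
approach_speeds = [-2.0, -1.0]
rebound_speeds = [1.0, 0.5]
restitution = fit_restitution(approach_speeds, rebound_speeds)
```

The same pattern, comparing what a contact model predicts against what was observed and adjusting parameters to shrink the gap, is the general idea behind the Real2Sim update, though with many more parameters and a full simulator in the loop.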
Wen Yang
Intelligent Robotics and Interactive Systems (IRIS) Lab, Arizona State University
Zhixian Xie
Arizona State University
Robot Learning, Dexterous Manipulation
Xuechao Zhang
Intelligent Robotics and Interactive Systems (IRIS) Lab, Arizona State University
H. B. Amor
Intelligent Robotics and Interactive Systems (IRIS) Lab, Arizona State University
Shan Lin
Intelligent Robotics and Interactive Systems (IRIS) Lab, Arizona State University
Wanxin Jin
Assistant Professor at Arizona State University
Robotics, Control, Optimization, Manipulation, Machine learning