TwinAligner: Visual-Dynamic Alignment Empowers Physics-aware Real2Sim2Real for Robotic Manipulation

πŸ“… 2025-12-22
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
End-to-end robotic policy learning is hindered by high real-world data acquisition costs and substantial simulation-to-reality (Sim2Real) discrepancies. To address this, we propose TwinAligner, a closed-loop Real2Sim2Real framework featuring a novel vision-dynamics dual-alignment mechanism. For vision alignment, we employ signed distance function (SDF)-based implicit reconstruction coupled with editable 3D Gaussian Splatting (3DGS) to achieve pixel-level fidelity. For dynamics alignment, we model robot-object interactions to identify rigid-body physical constraints, ensuring dynamical consistency across domains. Our cross-domain policy iteration framework enables zero-shot transfer. Experiments demonstrate that policies trained solely in simulation generalize efficiently to real robotic arms: both vision and dynamics alignment metrics achieve state-of-the-art performance, real-world and simulated policy behaviors exhibit strong consistency, and the Sim2Real performance gap is significantly reduced.
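As a toy illustration of the SDF representation underlying the vision-alignment step (an analytic sphere SDF, not the paper's learned reconstruction), the sign convention works like this:

```python
import numpy as np

# Toy signed distance function (SDF): negative inside the surface, zero on
# it, positive outside. The paper reconstructs scene geometry as an implicit
# SDF; this analytic sphere stands in for that learned function.
def sphere_sdf(points, center, radius):
    """Signed distance from each point to a sphere surface."""
    return np.linalg.norm(points - center, axis=-1) - radius

center = np.array([0.0, 0.0, 0.0])
pts = np.array([[0.0, 0.0, 0.0],   # at the center: distance -radius
                [1.0, 0.0, 0.0],   # on the surface: distance 0
                [2.0, 0.0, 0.0]])  # outside: distance +1
d = sphere_sdf(pts, center, radius=1.0)
```

Extracting the zero level set of such a function (e.g. via marching cubes) yields a watertight mesh suitable for simulation.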

πŸ“ Abstract
The robotics field is evolving towards data-driven, end-to-end learning, inspired by multimodal large models. However, reliance on expensive real-world data limits progress. Simulators offer cost-effective alternatives, but the gap between simulation and reality challenges effective policy transfer. This paper introduces TwinAligner, a novel Real2Sim2Real system that addresses both visual and dynamic gaps. The visual alignment module achieves pixel-level alignment through SDF reconstruction and editable 3DGS rendering, while the dynamic alignment module ensures dynamic consistency by identifying rigid-body physics from robot-object interaction. TwinAligner improves robot learning by providing scalable data collection and establishing a trustworthy iterative cycle, accelerating algorithm development. Quantitative evaluations highlight TwinAligner's strong capabilities in visual and dynamic real-to-sim alignment. The system enables policies trained in simulation to achieve strong zero-shot generalization to the real world. The high consistency between real-world and simulated policy performance underscores TwinAligner's potential to advance scalable robot learning. Code and data will be released at https://twin-aligner.github.io
Problem

Research questions and friction points this paper is trying to address.

Addresses visual and dynamic gaps in simulation-to-reality transfer for robotics
Enables scalable data collection and trustworthy iterative cycle for robot learning
Achieves zero-shot generalization of policies from simulation to real world
Innovation

Methods, ideas, or system contributions that make the work stand out.

Visual alignment via SDF reconstruction and editable 3DGS rendering
Dynamic alignment by identifying rigid physics from robot-object interaction
Real2Sim2Real system enabling zero-shot policy transfer to real world
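To make the dynamic-alignment idea concrete, here is a hedged sketch of identifying rigid-body parameters from robot-object interaction data. The setup (fitting an object's mass and sliding-friction coefficient from pushes at known accelerations via least squares) is a hypothetical illustration, not the identification procedure used in the paper:

```python
import numpy as np

# Sketch: identify mass m and friction coefficient mu from push interactions.
# Model (assumed): applied force F = m*a + mu*m*g during a sliding push,
# which is linear in the parameters (m, m*mu).
g = 9.81
a = np.array([0.5, 1.0, 1.5, 2.0])     # commanded accelerations (m/s^2)
true_m, true_mu = 0.4, 0.3
F = true_m * a + true_mu * true_m * g  # "measured" contact forces (N), noiseless here

# Least squares on F = [a, g] @ [m, m*mu]
A = np.stack([a, np.full_like(a, g)], axis=1)
theta, *_ = np.linalg.lstsq(A, F, rcond=None)
m_est = theta[0]
mu_est = theta[1] / m_est
```

Parameters recovered this way can be written into the simulator so that simulated contact dynamics match the real object's behavior.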
Hongwei Fan
Peking University
Robotics Β· 3D Vision
Hang Dai
Professor, Wuhan University; University of Glasgow
Deep Learning Β· Autonomous Driving Β· Medical Image Analysis
Jiyao Zhang
Peking University
Embodied AI Β· Robotics Β· 3D Vision
Jinzhou Li
Duke University
Robotics Β· Deep Reinforcement Learning Β· Manipulation
Qiyang Yan
CFCS, School of Computer Science, Peking University; PKU-AgiBot Lab
Yujie Zhao
CFCS, School of Computer Science, Peking University; PKU-AgiBot Lab
Mingju Gao
Unknown affiliation
Computer Vision Β· Robotics
Jinghang Wu
CFCS, School of Computer Science, Peking University; PKU-AgiBot Lab
Hao Tang
State Key Laboratory of Multimedia Information Processing, School of Computer Science, Peking University
Hao Dong
CFCS, School of Computer Science, Peking University; PKU-AgiBot Lab