Re$^3$Sim: Generating High-Fidelity Simulation Data via 3D-Photorealistic Real-to-Sim for Robotic Manipulation

📅 2025-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Addressing the high cost of real-world data collection and the geometric and visual discrepancies that limit simulation-to-reality (sim-to-real) transfer, this paper proposes a real-to-sim paradigm that integrates 3D reconstruction and neural rendering into a photorealistic real-to-simulation system. The method unifies multi-view 3D reconstruction, Neural Radiance Fields (NeRF)-based rendering, physics simulation, and cross-view camera modeling, enabling real-time, physically consistent multi-view synthetic rendering. It achieves zero-shot sim-to-real transfer using only synthetic data, without any real-world fine-tuning. Evaluated on diverse robotic manipulation tasks, policies trained solely on this synthetic data attain an average success rate above 58% and generalize to unseen objects. The framework also generates large-scale, high-fidelity simulation datasets. The core innovation lies in closing the loop between joint geometry-appearance modeling and physics-aware rendering, substantially reducing reliance on real-world annotations and physical interaction.

📝 Abstract
Real-world data collection for robotics is costly and resource-intensive, requiring skilled operators and expensive hardware. Simulations offer a scalable alternative but often fail to achieve sim-to-real generalization due to geometric and visual gaps. To address these challenges, we propose RE$^3$SIM, a 3D-photorealistic real-to-sim system that closes both the geometric and the visual sim-to-real gap. RE$^3$SIM employs advanced 3D reconstruction and neural rendering techniques to faithfully recreate real-world scenarios, enabling real-time rendering of simulated cross-view cameras within a physics-based simulator. By utilizing privileged information to efficiently collect expert demonstrations in simulation and training robot policies with imitation learning, we validate the effectiveness of the real-to-sim-to-real pipeline across various manipulation task scenarios. Notably, with only simulated data, we achieve zero-shot sim-to-real transfer with an average success rate exceeding 58%. To push the limits of real-to-sim, we further generate a large-scale simulation dataset, demonstrating how a robust policy that generalizes across various objects can be built from simulation data alone. Code and demos are available at: http://xshenhan.github.io/Re3Sim/.
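The real-to-sim-to-real recipe in the abstract — collect expert demonstrations in simulation using privileged state, then train a policy by imitation learning on rendered observations — can be sketched in a deliberately minimal form. Everything below (the toy dynamics, the scripted proportional expert, the linear policy) is illustrative only and is not taken from the paper's codebase:

```python
import numpy as np

rng = np.random.default_rng(0)

def scripted_expert(privileged_state):
    """Expert policy with privileged access to the exact gripper-to-object
    offset; a simple proportional controller moves toward the object."""
    return -0.5 * privileged_state

def render_observation(privileged_state):
    """Stand-in for photorealistic rendering: the policy only ever sees a
    (slightly noisy) observation, never the privileged state itself."""
    return privileged_state + 0.01 * rng.standard_normal(2)

# 1) Collect expert demonstrations in simulation.
states = rng.uniform(-1.0, 1.0, size=(500, 2))
observations = np.array([render_observation(s) for s in states])
actions = np.array([scripted_expert(s) for s in states])

# 2) Imitation learning: fit a linear policy obs -> action by least squares.
W, *_ = np.linalg.lstsq(observations, actions, rcond=None)

# 3) The learned policy acts from observations only, no privileged state.
def policy(obs):
    return obs @ W

test_state = np.array([0.4, -0.2])
print(policy(render_observation(test_state)))
# should approximately match scripted_expert(test_state)
```

In the paper's actual pipeline the rendering step is a reconstructed, NeRF-style photorealistic view inside a physics simulator and the policy is a visuomotor network, but the data flow (privileged expert in simulation, observation-only student) follows this shape.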
Problem

Research questions and friction points this paper is trying to address.

Addressing sim-to-real generalization gaps
Generating 3D-photorealistic simulation data
Enabling zero-shot sim-to-real transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D-photorealistic real-to-sim system
Advanced 3D reconstruction techniques
Neural rendering for real-time simulation
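For context on the neural-rendering bullet: NeRF-style renderers composite color along each camera ray by volume rendering. The standard discretized form (general background, not specific to this paper's implementation) is

$$\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right) \mathbf{c}_i, \qquad T_i = \exp\!\left(-\sum_{j=1}^{i-1} \sigma_j \delta_j\right),$$

where $\sigma_i$ and $\mathbf{c}_i$ are the density and color predicted at the $i$-th sample along ray $\mathbf{r}$, $\delta_i$ is the spacing between adjacent samples, and $T_i$ is the accumulated transmittance. Real-time variants accelerate exactly this accumulation so it can run inside a physics-based simulator loop.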
Xiaoshen Han
Shanghai Jiao Tong University
Minghuan Liu
University of Texas at Austin
Robotics · Reinforcement Learning · Imitation Learning
Yilun Chen
Shanghai AI Lab
Junqiu Yu
Shanghai AI Lab
Xiaoyang Lyu
The University of Hong Kong; Zhejiang University
Computer Vision · Depth Estimation
Yang Tian
Shanghai AI Lab
Bolun Wang
Shanghai AI Lab
Weinan Zhang
Shanghai Jiao Tong University
Jiangmiao Pang
Shanghai AI Lab