🤖 AI Summary
To address the high cost of real-world data collection and the failure of sim-to-real transfer caused by geometric and visual discrepancies, this paper proposes RE$^3$SIM, a 3D-photorealistic real-to-sim system built on 3D reconstruction and neural rendering. The method combines multi-view 3D reconstruction, neural rendering, physics simulation, and cross-view camera modeling to render simulated cameras in real time inside a physics-based simulator. Policies trained only on the resulting synthetic data, with no real-world fine-tuning, transfer zero-shot to real robots: across several manipulation tasks they attain an average success rate above 58%. The framework is also used to generate a large-scale simulation dataset, from which a policy that generalizes across various objects is trained. The core idea is to couple joint geometry and appearance modeling with rendering inside a physics simulator, which substantially reduces reliance on real-world data collection and physical interaction.
📝 Abstract
Real-world data collection for robotics is costly and resource-intensive, requiring skilled operators and expensive hardware. Simulation offers a scalable alternative, but policies trained in simulation often fail to generalize to the real world because of geometric and visual gaps. To address these challenges, we propose RE$^3$SIM, a 3D-photorealistic real-to-sim system. RE$^3$SIM employs advanced 3D reconstruction and neural rendering techniques to faithfully recreate real-world scenarios, enabling real-time rendering of simulated cross-view cameras within a physics-based simulator. By using privileged information to collect expert demonstrations efficiently in simulation and training robot policies with imitation learning, we validate the effectiveness of the real-to-sim-to-real pipeline across various manipulation task scenarios. Notably, with only simulated data, we achieve zero-shot sim-to-real transfer with an average success rate exceeding 58%. To push the limit of real-to-sim, we further generate a large-scale simulation dataset, demonstrating how a robust policy that generalizes across various objects can be built from simulation data. Code and demos are available at: http://xshenhan.github.io/Re3Sim/.
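
The demonstration-collection and imitation-learning step described in the abstract can be illustrated with a toy sketch. This is not the RE$^3$SIM code: the 2D reaching task, the linear policy, the noise model standing in for rendered camera observations, and every function name below (`collect_demos`, `fit_bc_policy`, `success_rate`) are assumptions made purely for illustration. It shows only the general pattern: a scripted expert reads privileged simulator state to produce actions, and a policy that sees only observations is fit to those actions by behavior cloning.

```python
import numpy as np


def collect_demos(n_episodes, rng, obs_noise=0.01):
    """Scripted expert with privileged access to the true target position.

    Hypothetical toy task: move a 2D gripper onto a 2D target in one step.
    """
    observations, actions = [], []
    for _ in range(n_episodes):
        target = rng.uniform(-1.0, 1.0, size=2)   # privileged simulator state
        gripper = rng.uniform(-1.0, 1.0, size=2)
        # The expert uses the exact target position to compute its action.
        action = target - gripper
        # The policy only sees a noisy target estimate, a stand-in for
        # features extracted from rendered camera images (assumption).
        noisy_target = target + rng.normal(0.0, obs_noise, size=2)
        observations.append(np.concatenate([gripper, noisy_target]))
        actions.append(action)
    return np.asarray(observations), np.asarray(actions)


def fit_bc_policy(observations, actions):
    """Behavior cloning as linear least squares: actions ~ [obs, 1] @ W."""
    X = np.hstack([observations, np.ones((len(observations), 1))])
    W, *_ = np.linalg.lstsq(X, actions, rcond=None)
    return W


def success_rate(W, n_trials, rng, tol=0.05, obs_noise=0.01):
    """Fraction of fresh trials in which the policy lands within tol of the target."""
    observations, expert_actions = collect_demos(n_trials, rng, obs_noise)
    X = np.hstack([observations, np.ones((len(observations), 1))])
    predicted = X @ W
    # Distance between where the policy ends up and the true target.
    miss = np.linalg.norm(predicted - expert_actions, axis=1)
    return float(np.mean(miss < tol))


rng = np.random.default_rng(0)
demo_obs, demo_actions = collect_demos(500, rng)
policy = fit_bc_policy(demo_obs, demo_actions)
rate = success_rate(policy, 200, rng)
print(f"toy success rate: {rate:.2f}")
```

In the actual system the observations would be rendered images and the policy a learned visuomotor network; the sketch keeps only the split between privileged expert inputs and policy inputs.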