🤖 AI Summary
Embodied AI evaluation, particularly for visual navigation, in open urban environments suffers from limited reproducibility due to insufficient simulation fidelity: existing methods fail to simultaneously achieve high-fidelity sensor rendering and geometrically accurate interaction, while emerging paradigms (e.g., video-to-3D Gaussian Splatting) still exhibit substantial visual and geometric realism gaps. Method: We propose the first real-to-sim framework integrating multi-sensor data acquisition, collaborative NeRF and 3D Gaussian Splatting (3DGS) reconstruction, and geometry-constrained novel view synthesis, enabling high-precision geometric modeling and photorealistic perception simulation for complex indoor-outdoor urban scenes. Contribution/Results: We construct a diverse urban-scene dataset and empirically demonstrate that geometric accuracy critically determines novel view synthesis quality and navigation policy generalizability. Our framework substantially narrows the sim-to-real gap and enables joint benchmarking of navigation, view synthesis, and 3D reconstruction, thereby enhancing evaluation credibility and reproducibility.
📄 Abstract
Reproducible closed-loop evaluation remains a major bottleneck in Embodied AI tasks such as visual navigation. A promising path forward is high-fidelity simulation that combines photorealistic sensor rendering with geometrically grounded interaction in complex, open-world urban environments. Although recent video-to-3DGS methods ease open-world scene capture, they remain unsuitable for benchmarking due to large visual and geometric sim-to-real gaps. To address these challenges, we introduce Wanderland, a real-to-sim framework that features multi-sensor capture, reliable reconstruction, accurate geometry, and robust view synthesis. Using this pipeline, we curate a diverse dataset of indoor-outdoor urban scenes and systematically demonstrate how image-only pipelines scale poorly, how geometry quality impacts novel view synthesis, and how both adversely affect navigation policy learning and evaluation reliability. Beyond serving as a trusted testbed for embodied navigation, Wanderland's rich raw sensor data further enables benchmarking of 3D reconstruction and novel view synthesis models. Our work establishes a new foundation for reproducible research in open-world embodied AI. The project website is at https://ai4ce.github.io/wanderland/.