🤖 AI Summary
Embodied AI faces significant challenges in sim-to-real transfer due to fidelity gaps between synthetic simulations and real-world environments.
Method: This paper proposes a lightweight, cost-effective real-scene modeling framework leveraging iPhone-captured imagery and 3D Gaussian Splatting for high-fidelity, personalized scene reconstruction. The reconstructed scenes are integrated into Habitat-Sim, where navigation policies are fine-tuned jointly on mesh-based simulation and real-image-guided navigation tasks.
Contribution/Results: To our knowledge, this is the first work to establish a closed-loop "real → simulation → real" navigation adaptation pipeline. Compared to large-scale pre-trained baselines, our approach achieves absolute improvements of 20–40% in real-world navigation success rate, with simulation-to-reality behavioral correlation reaching 0.87–0.97. These results demonstrate substantially enhanced policy generalization and environmental adaptability.
📄 Abstract
The field of Embodied AI predominantly relies on simulation for training and evaluation, often using either fully synthetic environments that lack photorealism or high-fidelity real-world reconstructions captured with expensive hardware. As a result, sim-to-real transfer remains a major challenge. In this paper, we introduce EmbodiedSplat, a novel approach that personalizes policy training by efficiently capturing the deployment environment and fine-tuning policies within the reconstructed scenes. Our method leverages 3D Gaussian Splatting (GS) and the Habitat-Sim simulator to bridge the gap between realistic scene capture and effective training environments. Using iPhone-captured deployment scenes, we reconstruct meshes via GS, enabling training in settings that closely approximate real-world conditions. We conduct a comprehensive analysis of training strategies, pre-training datasets, and mesh reconstruction techniques, evaluating their impact on sim-to-real predictivity in real-world scenarios. Experimental results demonstrate that agents fine-tuned with EmbodiedSplat outperform zero-shot baselines pre-trained on a large-scale real-world dataset (HM3D) and a synthetically generated dataset (HSSD), achieving absolute success rate improvements of 20% and 40%, respectively, on a real-world Image Navigation task. Moreover, our approach yields a high sim-vs-real correlation (0.87–0.97) for the reconstructed meshes, underscoring its effectiveness in adapting policies to diverse environments with minimal effort. Project page: https://gchhablani.github.io/embodied-splat