NavGSim: High-Fidelity Gaussian Splatting Simulator for Large-Scale Navigation

📅 2026-03-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes NavGSim, a navigation simulator built on hierarchical 3D Gaussian splatting, to address the lack of high-fidelity, large-scale simulation environments for robot navigation. It extends 3D Gaussian splatting to floor-scale scenes spanning hundreds of square meters, providing photorealistic rendering together with collision simulation. A Gaussian slicing technique extracts navigable regions and collision information directly from the reconstructed Gaussians. NavGSim also provides multi-GPU support and APIs for custom scene reconstruction, robot configuration, policy training, and evaluation, so that Vision-Language-Action (VLA) policies can be trained on trajectories collected in the simulator. In the reported experiments, a VLA model trained this way handles diverse navigation instructions more effectively in both simulated and real-world environments.

📝 Abstract

Simulating realistic environments for robots is widely recognized as a critical challenge in robot learning, particularly in terms of rendering and physical simulation. This challenge becomes even more pronounced in navigation tasks, where trajectories often extend across multiple rooms or entire floors. In this work, we present NavGSim, a Gaussian Splatting-based simulator designed to generate high-fidelity, large-scale navigation environments. Built upon a hierarchical 3D Gaussian Splatting framework, NavGSim enables photorealistic rendering in expansive scenes spanning hundreds of square meters. To simulate navigation collisions, we introduce a Gaussian Splatting-based slice technique that directly extracts navigable areas from reconstructed Gaussians. Additionally, for ease of use, we provide comprehensive NavGSim APIs supporting multi-GPU development, including tools for custom scene reconstruction, robot configuration, policy training, and evaluation. To evaluate NavGSim's effectiveness, we train a Vision-Language-Action (VLA) model using trajectories collected from NavGSim and assess its performance in both simulated and real-world environments. Our results demonstrate that NavGSim significantly enhances the VLA model's scene understanding, enabling the policy to handle diverse navigation queries effectively.
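The abstract does not spell out how the slice technique works, so the following is only a minimal sketch of one plausible reading: keep the Gaussians whose centers fall inside a height band around the robot body, rasterize them into a 2D grid, and treat the remaining cells as navigable. Everything here is an assumption made for illustration, including the function names `slice_occupancy` and `is_free`, the `min_opacity` threshold, and the use of Gaussian centers only (a real implementation would presumably account for each Gaussian's extent). It is not NavGSim's actual API.

```python
import numpy as np

def slice_occupancy(centers, opacities, z_min, z_max, cell=0.1, min_opacity=0.5):
    """Rasterize Gaussian centers inside a height band into a 2D occupancy grid.

    centers:   (N, 3) array of Gaussian means (x, y, z), z pointing up (assumed).
    opacities: (N,) array of per-Gaussian opacity in [0, 1].
    Returns (occupied, origin): a boolean grid and the xy position of cell (0, 0).
    """
    centers = np.asarray(centers, dtype=float)
    opacities = np.asarray(opacities, dtype=float)

    # Keep reasonably opaque Gaussians whose centers lie in the height band.
    in_band = (centers[:, 2] >= z_min) & (centers[:, 2] <= z_max)
    solid = in_band & (opacities >= min_opacity)

    # The grid covers the xy bounding box of the whole scene.
    origin = centers[:, :2].min(axis=0)
    extent = centers[:, :2].max(axis=0) - origin
    shape = np.floor(extent / cell).astype(int) + 1

    occupied = np.zeros(shape, dtype=bool)
    idx = np.floor((centers[solid, :2] - origin) / cell).astype(int)
    occupied[idx[:, 0], idx[:, 1]] = True
    return occupied, origin

def is_free(occupied, origin, xy, cell=0.1):
    """Collision query: True if xy lies inside the grid and in an unoccupied cell."""
    i, j = np.floor((np.asarray(xy, dtype=float) - origin) / cell).astype(int)
    inside = 0 <= i < occupied.shape[0] and 0 <= j < occupied.shape[1]
    return bool(inside and not occupied[i, j])
```

In this sketch the navigable area is simply `~occupied`, and a simulator could call `is_free` on each proposed robot position to reject moves that would collide. Cells that were never observed are also marked free here, which is a simplification.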

Problem

Research questions and friction points this paper is trying to address.

robot navigation

realistic simulation

large-scale environments

photorealistic rendering

collision simulation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Splatting

large-scale navigation simulation

photorealistic rendering