GaussRender: Learning 3D Occupancy with Gaussian Rendering

📅 2025-02-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing 3D occupancy prediction methods treat voxels as independent units and neglect spatial dependencies, which leads to geometric-semantic inconsistencies that can compromise autonomous driving safety. To address this, we propose a differentiable 3D-to-2D reprojection loss based on Gaussian splatting. It is the first to employ Gaussian splatting as a differentiable rendering proxy for voxels, and it requires no modification of network architectures. This enables view-aware cross-dimensional supervision and explicit multi-view geometric consistency modeling. As a plug-and-play loss, it enforces inter-voxel spatial constraints. The method consistently improves state-of-the-art models (SurroundOcc, TPVFormer, and Symphonies) on nuScenes (SurroundOcc-nuScenes and Occ3D-nuScenes) and KITTI-360 (SSCBench-KITTI360), with better occlusion robustness and geometric-semantic alignment.

📝 Abstract
Understanding the 3D geometry and semantics of driving scenes is critical for developing safe autonomous vehicles. While 3D occupancy models are typically trained using voxel-based supervision with standard losses (e.g., cross-entropy, Lovasz, dice), these approaches treat voxel predictions independently, neglecting their spatial relationships. In this paper, we propose GaussRender, a plug-and-play 3D-to-2D reprojection loss that enhances voxel-based supervision. Our method projects 3D voxel representations into arbitrary 2D perspectives and leverages Gaussian splatting as an efficient, differentiable rendering proxy of voxels, introducing spatial dependencies across projected elements. This approach improves semantic and geometric consistency, handles occlusions more efficiently, and requires no architectural modifications. Extensive experiments on multiple benchmarks (SurroundOcc-nuScenes, Occ3D-nuScenes, SSCBench-KITTI360) demonstrate consistent performance gains across various 3D occupancy models (TPVFormer, SurroundOcc, Symphonies), highlighting the robustness and versatility of our framework. The code is available at https://github.com/valeoai/GaussRender.
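
To make the idea of rendering voxels through Gaussians concrete, the following is a minimal NumPy sketch of the forward pass only, not the authors' implementation. It assumes that the last probability channel means "empty", that each voxel becomes one isotropic Gaussian at its center, and that the camera is a simple pinhole model. The function names `voxels_to_gaussians` and `render_semantics`, and the parameter `sigma_world`, are hypothetical. A full Gaussian splatting rasterizer projects anisotropic covariances and runs on the GPU; this sketch swaps that for an isotropic 2D footprint and a Python loop.

```python
import numpy as np


def voxels_to_gaussians(probs, voxel_size, origin):
    """Turn a (X, Y, Z, C) grid of class probabilities into Gaussians.

    Assumption (not from the source): the last channel is the 'empty' class.
    """
    X, Y, Z, C = probs.shape
    idx = np.stack(
        np.meshgrid(np.arange(X), np.arange(Y), np.arange(Z), indexing="ij"),
        axis=-1,
    ).reshape(-1, 3)
    # One Gaussian per voxel, placed at the voxel center in world coordinates.
    centers = (idx + 0.5) * voxel_size + np.asarray(origin, dtype=float)
    flat = probs.reshape(-1, C)
    opacity = 1.0 - flat[:, -1]  # probability that the voxel is occupied
    # Semantic distribution conditioned on the voxel being occupied.
    semantics = flat[:, :-1] / np.maximum(opacity, 1e-8)[:, None]
    return centers, opacity, semantics


def render_semantics(centers, opacity, semantics, K, cam_T_world, hw, sigma_world):
    """Alpha-composite isotropic Gaussians front to back into a semantic map."""
    H, W = hw
    # World -> camera coordinates (camera looks along +z).
    pts = centers @ cam_T_world[:3, :3].T + cam_T_world[:3, 3]
    keep = (pts[:, 2] > 1e-3) & (opacity > 1e-4)
    pts, opacity, semantics = pts[keep], opacity[keep], semantics[keep]
    depth = pts[:, 2]
    # Pinhole projection of the Gaussian centers.
    u = K[0, 0] * pts[:, 0] / depth + K[0, 2]
    v = K[1, 1] * pts[:, 1] / depth + K[1, 2]
    # Simplification: isotropic pixel footprint that shrinks with depth.
    sigma_px = K[0, 0] * sigma_world / depth

    ys, xs = np.mgrid[0:H, 0:W]
    image = np.zeros((H, W, semantics.shape[1]))
    transmittance = np.ones((H, W))
    for i in np.argsort(depth):  # nearest Gaussian first
        d2 = (xs - u[i]) ** 2 + (ys - v[i]) ** 2
        alpha = opacity[i] * np.exp(-0.5 * d2 / sigma_px[i] ** 2)
        image += (transmittance * alpha)[..., None] * semantics[i]
        transmittance = transmittance * (1.0 - alpha)
    return image, 1.0 - transmittance  # semantic map, accumulated opacity
```

Each pixel blends contributions from several Gaussians, and nearer Gaussians reduce the transmittance left for those behind them. A 2D loss on the rendered map therefore couples neighboring and occluding voxels, which a per-voxel loss does not. Training would need the same computation in an autodiff framework; NumPy is used here only to show the forward pass.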
Problem

Research questions and friction points this paper is trying to address.

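Standard voxel losses (cross-entropy, Lovasz, dice) treat voxel predictions independently and neglect their spatial relationships.
Predictions trained this way can lack semantic and geometric consistency.
Occlusions should be handled efficiently, without architectural changes.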
Innovation

Methods, ideas, or system contributions that make the work stand out.

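A plug-and-play 3D-to-2D reprojection loss
Gaussian splatting as an efficient, differentiable rendering proxy for voxels
Enhanced voxel-based supervision with no architectural modifications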
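
As a rough illustration of how a reprojection term can sit alongside voxel-based supervision, the sketch below adds a 2D rendering term to a standard per-voxel cross-entropy. The L1 distance, the `weight` parameter, and the assumption that the 2D target is obtained by rendering the ground-truth voxels the same way as the predictions are illustrative choices, not details confirmed by the text above. All function names are hypothetical.

```python
import numpy as np


def voxel_cross_entropy(probs, labels, eps=1e-8):
    """Mean per-voxel cross-entropy. probs: (N, C), labels: (N,) integer ids."""
    picked = probs[np.arange(labels.size), labels]
    return float(-np.log(picked + eps).mean())


def render_l1(pred_render, target_render):
    """Mean absolute difference between two rendered 2D maps of equal shape."""
    return float(np.abs(pred_render - target_render).mean())


def total_loss(probs, labels, pred_render, target_render, weight=1.0):
    """3D voxel loss plus a weighted 2D rendering term (weight is an assumption)."""
    loss_3d = voxel_cross_entropy(probs, labels)
    loss_2d = render_l1(pred_render, target_render)
    return loss_3d + weight * loss_2d
```

Because the extra term only consumes the model's voxel outputs, it can be added to an existing training loop without touching the network itself.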