DriveSplat: Decoupled Driving Scene Reconstruction with Geometry-enhanced Partitioned Neural Gaussians

📅 2025-08-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address motion blur around dynamic objects and inaccurate geometry in large-scale static backgrounds in autonomous driving scenes, this paper proposes DriveSplat, a neural Gaussian representation with dynamic-static decoupling. The method makes three contributions: (1) a region-wise voxel initialization scheme that partitions the scene into near, middle, and far regions to enhance close-range detail; (2) deformable neural Gaussians, whose parameters are adjusted over time by a learnable deformation network, to model non-rigid dynamic actors; and (3) supervision of the whole framework with depth and normal priors from pre-trained models to improve geometric accuracy. Evaluated on the Waymo and KITTI datasets, the approach reports state-of-the-art novel-view synthesis for driving scenarios.

📝 Abstract

In the realm of driving scenarios, the presence of rapidly moving vehicles, pedestrians in motion, and large-scale static backgrounds poses significant challenges for 3D scene reconstruction. Recent methods based on 3D Gaussian Splatting address the motion blur problem by decoupling dynamic and static components within the scene. However, these decoupling strategies overlook background optimization with adequate geometry relationships and rely solely on fitting each training view by adding Gaussians. Therefore, these models exhibit limited robustness in rendering novel views and lack an accurate geometric representation. To address the above issues, we introduce DriveSplat, a high-quality reconstruction method for driving scenarios based on neural Gaussian representations with dynamic-static decoupling. To better accommodate the predominantly linear motion patterns of driving viewpoints, a region-wise voxel initialization scheme is employed, which partitions the scene into near, middle, and far regions to enhance close-range detail representation. Deformable neural Gaussians are introduced to model non-rigid dynamic actors, whose parameters are temporally adjusted by a learnable deformation network. The entire framework is further supervised by depth and normal priors from pre-trained models, improving the accuracy of geometric structures. Our method has been rigorously evaluated on the Waymo and KITTI datasets, demonstrating state-of-the-art performance in novel-view synthesis for driving scenarios.
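The region-wise voxel initialization described in the abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the distance thresholds, the voxel sizes, measuring distance from a single origin, and the names `REGIONS`, `region_of`, and `init_anchors` are hypothetical and are not taken from the paper. The only idea carried over from the abstract is that the scene is split into near, middle, and far regions, with finer resolution close to the viewpoint.

```python
import math
from collections import defaultdict

# Hypothetical region table: (name, min distance, max distance, voxel size), in metres.
# The paper's actual thresholds and voxel sizes are not given in this summary.
REGIONS = [
    ("near", 0.0, 20.0, 0.1),
    ("middle", 20.0, 60.0, 0.5),
    ("far", 60.0, math.inf, 2.0),
]


def region_of(point, origin=(0.0, 0.0, 0.0)):
    """Return (region name, voxel size) for a 3D point, by Euclidean distance from origin."""
    distance = math.dist(point, origin)
    for name, lower, upper, voxel_size in REGIONS:
        if lower <= distance < upper:
            return name, voxel_size
    raise ValueError(f"no region covers distance {distance}")


def init_anchors(points, origin=(0.0, 0.0, 0.0)):
    """Snap points to voxel centres, using a finer grid for closer regions.

    Returns a dict mapping region name to a sorted list of unique voxel centres,
    so each occupied voxel contributes exactly one anchor.
    """
    anchors = defaultdict(set)
    for point in points:
        name, voxel_size = region_of(point, origin)
        index = tuple(math.floor(c / voxel_size) for c in point)
        centre = tuple((i + 0.5) * voxel_size for i in index)
        anchors[name].add(centre)
    return {name: sorted(centres) for name, centres in anchors.items()}
```

With these assumed sizes, two nearby points that are 3 cm apart collapse into one near-region anchor, while distant points are stored on a much coarser grid, which keeps the anchor budget concentrated on close-range detail.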

Problem

Research questions and friction points this paper is trying to address.

Reconstructing driving scenes with moving objects and static backgrounds

Addressing limited robustness in novel view rendering

Improving geometric accuracy in dynamic-static decoupled reconstruction

Innovation

Methods, ideas, or system contributions that make the work stand out.

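Region-wise voxel initialization (near, middle, far) for close-range detail

Deformable neural Gaussians, driven by a learnable deformation network, for non-rigid dynamic actors

Depth and normal prior supervision from pre-trained models for geometric accuracy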
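As a rough illustration of supervising geometry with depth and normal priors, the sketch below combines an L1 depth term with a cosine-based normal term. This is an assumption about the general shape of such a loss, not the paper's formulation: the choice of L1 and cosine distance, the weights `w_depth` and `w_normal`, and all function names are hypothetical. It also assumes the prior depth is already in the same scale as the rendered depth, which may not hold for monocular depth estimators without an alignment step.

```python
import math


def l1_depth_loss(rendered_depth, prior_depth):
    """Mean absolute difference between rendered and prior depth values."""
    total = sum(abs(r - p) for r, p in zip(rendered_depth, prior_depth))
    return total / len(rendered_depth)


def normal_cosine_loss(rendered_normals, prior_normals):
    """Mean of (1 - cosine similarity) over per-pixel normal vectors."""
    total = 0.0
    for n, m in zip(rendered_normals, prior_normals):
        dot = sum(a * b for a, b in zip(n, m))
        norm = math.sqrt(sum(a * a for a in n)) * math.sqrt(sum(b * b for b in m))
        # Guard against zero-length vectors.
        total += 1.0 - dot / max(norm, 1e-8)
    return total / len(rendered_normals)


def geometry_loss(rendered_depth, prior_depth, rendered_normals, prior_normals,
                  w_depth=1.0, w_normal=1.0):
    """Weighted sum of the depth and normal terms (weights are hypothetical)."""
    return (w_depth * l1_depth_loss(rendered_depth, prior_depth)
            + w_normal * normal_cosine_loss(rendered_normals, prior_normals))
```

In a real training loop these terms would be computed on rendered depth and normal maps with an autodiff framework and added to the photometric loss; plain Python lists are used here only to keep the sketch self-contained.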

🔎 Similar Papers

OmniRe: Omni Urban Scene Reconstruction