$D^2GS$: Dense Depth Regularization for LiDAR-free Urban Scene Reconstruction

📅 2025-10-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing LiDAR-dependent 3D urban reconstruction methods suffer from challenging spatiotemporal calibration, spatial misalignment, and reprojection errors. To address these issues, this work proposes the first fully LiDAR-free Gaussian splatting framework for high-fidelity urban scene reconstruction. The method initializes a point cloud via multi-view depth prediction and inverse projection, introduces a learnable dense depth enhancement module, and incorporates diffusion-model priors to guide depth refinement. It further enforces geometric consistency and ground-structure accuracy through progressive Gaussian pruning and road-region-specific anisotropic Gaussian shape constraints. Experiments on the Waymo dataset demonstrate that the approach significantly outperforms existing LiDAR-free methods and achieves geometric reconstruction quality competitive with, and sometimes surpassing, state-of-the-art methods trained with real LiDAR supervision. To the authors' knowledge, this is the first work to achieve high-quality, purely vision-driven Gaussian splatting reconstruction of complex urban scenes.

📝 Abstract
Recently, Gaussian Splatting (GS) has shown great potential for urban scene reconstruction in the field of autonomous driving. However, current urban scene reconstruction methods often depend on multimodal sensor inputs, i.e., LiDAR and images. Although the geometry prior provided by LiDAR point clouds can largely mitigate the ill-posedness of reconstruction, acquiring such accurate LiDAR data is still challenging in practice: i) precise spatiotemporal calibration between LiDAR and other sensors is required, as they may not capture data simultaneously; ii) reprojection errors arise from spatial misalignment when LiDAR and cameras are mounted at different locations. To avoid the difficulty of acquiring accurate LiDAR depth, we propose $D^2GS$, a LiDAR-free urban scene reconstruction framework. In this work, we obtain geometry priors that are as effective as LiDAR while being denser and more accurate. First, we initialize a dense point cloud by back-projecting multi-view metric depth predictions; this point cloud is then optimized by a Progressive Pruning strategy to improve global consistency. Second, we jointly refine Gaussian geometry and the predicted dense metric depth via a Depth Enhancer: we leverage diffusion priors from a depth foundation model to enhance the depth maps rendered by the Gaussians, and in turn the enhanced depths provide stronger geometric constraints during Gaussian training. Finally, we improve the accuracy of ground geometry by constraining the shape and normal attributes of Gaussians within road regions. Extensive experiments on the Waymo dataset demonstrate that our method consistently outperforms state-of-the-art methods, producing more accurate geometry even when compared with methods that use ground-truth LiDAR data.
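The paper's release does not accompany this summary with code, but the initialization step it describes (back-projecting multi-view metric depth predictions into a dense point cloud) is a standard operation. A minimal sketch, assuming a pinhole camera model with known intrinsics and a camera-to-world extrinsic matrix (all function and variable names here are illustrative, not from the paper):

```python
import numpy as np

def backproject_depth(depth, K, cam_to_world):
    """Back-project a metric depth map to a world-space point cloud.

    depth:        (H, W) metric depth map (e.g. from a depth foundation model)
    K:            (3, 3) pinhole camera intrinsics
    cam_to_world: (4, 4) camera-to-world extrinsic matrix
    Returns an (H*W, 3) array of world-space points.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Pixel -> camera-frame coordinates, scaled by metric depth
    x = (u - K[0, 2]) / K[0, 0] * depth
    y = (v - K[1, 2]) / K[1, 1] * depth
    pts_cam = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    # Homogeneous transform into the world frame
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    return (pts_h @ cam_to_world.T)[:, :3]
```

Repeating this over all views and concatenating the results yields the dense initialization; the paper then filters this cloud with its Progressive Pruning strategy, which is not sketched here.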
Problem

Research questions and friction points this paper is trying to address.

Eliminating LiDAR dependency for urban scene reconstruction in autonomous driving
Addressing sensor calibration and spatial misalignment issues in multimodal systems
Improving geometric accuracy without requiring expensive LiDAR data acquisition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multi-view depth predictions to initialize dense point cloud
Refines Gaussian geometry with diffusion-based depth enhancement
Improves ground geometry by constraining road region Gaussians
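The road-region constraint above can be pictured as a regularizer that flattens road Gaussians into thin, ground-aligned disks. A hedged sketch of one plausible form of such a penalty, not the paper's actual loss (all names and the exact terms are assumptions for illustration):

```python
import numpy as np

def road_shape_loss(scales, normals, road_mask):
    """Illustrative shape/normal penalty for road-region Gaussians.

    scales:    (N, 3) per-Gaussian scale factors (positive)
    normals:   (N, 3) per-Gaussian unit normals (direction of shortest axis)
    road_mask: (N,) boolean mask selecting Gaussians in road regions
    Returns a scalar that is small when road Gaussians are flat
    disks whose normals point along the world up axis.
    """
    up = np.array([0.0, 0.0, 1.0])
    s = scales[road_mask]
    n = normals[road_mask]
    # Flatness term: the shortest axis of each road Gaussian shrinks toward zero
    flat = s.min(axis=-1).mean()
    # Normal term: penalize deviation of the Gaussian normal from vertical
    align = (1.0 - np.abs(n @ up)).mean()
    return flat + align
```

In practice such a term would be added to the rendering loss with a weighting coefficient and evaluated only on Gaussians projected into a segmented road mask.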