🤖 AI Summary
To address the prohibitive computational cost of 3D Gaussian Splatting (3DGS) in large-scale street-scene reconstruction, where complexity scales superlinearly with scene size, and its reliance on hard-to-obtain ground-truth 3D bounding boxes for static-dynamic object separation, this paper proposes a lightweight street-scene 3DGS reconstruction framework. Methodologically, it (1) eliminates the need for 3D bounding boxes by introducing adaptive static-dynamic decoupling based solely on readily available 2D detection boxes; (2) employs joint local-global transformation optimization to minimize redundant geometric transformations; and (3) designs an efficient culling strategy for long-range rendering. Evaluated on videos from the Argoverse2 dataset, the proposed method achieves state-of-the-art PSNR and SSIM while reducing reconstruction time to 20–50% of that of mainstream approaches. This substantial acceleration improves practical deployability in real-world settings and scalability to large-scale urban environments.
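The summary does not spell out how the adaptive 2D-box decoupling works internally, but the core idea can be illustrated with a minimal sketch: a Gaussian (represented here by its 3D center) is flagged as potentially dynamic when its pinhole projection lands inside any 2D detection box for the current frame. The function name, intrinsics, and box values below are hypothetical, not from the paper.

```python
import numpy as np

def dynamic_mask_from_2d_boxes(points_cam, K, boxes):
    """Flag 3D points whose pinhole projection falls inside any 2D detection box.

    points_cam: (N, 3) points in camera coordinates (z > 0, looking forward)
    K:          3x3 camera intrinsics
    boxes:      iterable of (x0, y0, x1, y1) pixel-space detection boxes
    """
    uvw = (K @ points_cam.T).T          # project to homogeneous image coords
    uv = uvw[:, :2] / uvw[:, 2:3]       # perspective divide -> pixel coords
    mask = np.zeros(len(points_cam), dtype=bool)
    for x0, y0, x1, y1 in boxes:
        inside = (uv[:, 0] >= x0) & (uv[:, 0] <= x1) & \
                 (uv[:, 1] >= y0) & (uv[:, 1] <= y1)
        mask |= inside                  # union over all detection boxes
    return mask

# toy example: one point on static background, one on a detected car
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 10.0],    # projects to (50, 50) -> static
                   [2.0, 0.0, 10.0]])   # projects to (70, 50) -> inside car box
car_boxes = [(60.0, 40.0, 80.0, 60.0)]
mask = dynamic_mask_from_2d_boxes(points, K, car_boxes)
print(mask.tolist())  # [False, True]
```

In a full pipeline the mask would be refined over time (a point consistently inside boxes across frames is more likely truly dynamic), which is presumably where the "adaptive" part of the paper's design comes in.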
📝 Abstract
Recently, 3D Gaussian Splatting (3DGS) has reshaped the field of photorealistic 3D reconstruction, achieving impressive rendering quality and speed. However, when applied to large-scale street scenes, existing methods suffer from rapidly escalating per-viewpoint reconstruction costs as scene size increases, leading to significant computational overhead. After revisiting the conventional pipeline, we identify three key factors accounting for this issue: unnecessary local-to-global transformations, excessive 3D-to-2D projections, and inefficient rendering of distant content. To address these challenges, we propose S3R-GS, a 3DGS framework that Streamlines the pipeline for large-scale Street Scene Reconstruction, effectively mitigating these limitations. Moreover, most existing street 3DGS methods rely on ground-truth 3D bounding boxes to separate dynamic and static components, but 3D bounding boxes are difficult to obtain, limiting real-world applicability. To address this, we propose an alternative solution based on 2D boxes, which are easier to annotate or can be predicted by off-the-shelf vision foundation models. Together, these designs make S3R-GS readily adaptable to large, in-the-wild scenarios. Extensive experiments demonstrate that S3R-GS enhances rendering quality and significantly accelerates reconstruction. Remarkably, when applied to videos from the challenging Argoverse2 dataset, it achieves state-of-the-art PSNR and SSIM while reducing reconstruction time to below 50% (and in some cases 20%) of that of competing methods.
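The abstract identifies inefficient rendering of distant content as one source of overhead. The paper's exact culling strategy is not described here, but the simplest form of the idea is a per-view distance test that drops far-away Gaussians before projection and rasterization. The sketch below is a generic illustration under that assumption; the function name and threshold are hypothetical.

```python
import numpy as np

def cull_by_distance(centers, cam_pos, max_dist):
    """Return a boolean mask keeping only Gaussian centers within max_dist
    of the camera, so distant content is skipped before rasterization."""
    dists = np.linalg.norm(centers - cam_pos, axis=1)
    return dists <= max_dist

# toy scene: near road surface, mid-range vehicle, far building facade
centers = np.array([[0.0, 0.0, 5.0],
                    [0.0, 0.0, 80.0],
                    [0.0, 0.0, 500.0]])
keep = cull_by_distance(centers, cam_pos=np.zeros(3), max_dist=100.0)
print(keep.tolist())  # [True, True, False]
```

Because the mask is computed per viewpoint, each camera only pays projection cost for Gaussians it can meaningfully see, which is consistent with the paper's goal of keeping per-viewpoint cost from growing with total scene size.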