🤖 AI Summary
To address the limited field of view of monocular RGB streams, which prevents complete scene coverage in 3D Gaussian Splatting (3DGS), this paper introduces the first real-time, large-scale 3D reconstruction framework tailored to multi-camera rigs. The method combines hierarchical camera initialization with a lightweight multi-camera bundle adjustment to achieve calibration-free, drift-free online trajectory estimation; it further proposes a redundancy-free Gaussian sampling strategy and a frequency-aware optimization scheduler, fusing multi-view data into a unified Gaussian representation for efficient online registration and mapping. Experiments show that, given only raw multi-camera video streams, the system reconstructs high-fidelity scenes spanning hundreds of meters within two minutes, achieving state-of-the-art reconstruction speed, robustness to motion and lighting variation, and geometric fidelity.
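The summary's "lightweight multi-camera bundle adjustment" presumably optimizes a single shared rig pose per timestep while the per-camera extrinsics stay fixed, which keeps the problem small enough for real time. The paper does not publish its formulation, so the sketch below is only an illustrative residual function under that assumption; all names (`reproject`, `rig_ba_residuals`) are hypothetical:

```python
import numpy as np

def reproject(K, T_rig_cam, T_world_rig, X_world):
    # world -> rig frame, then rig -> camera frame (4x4 poses, R|t convention)
    X_rig = T_world_rig[:3, :3].T @ (X_world - T_world_rig[:3, 3])
    X_cam = T_rig_cam[:3, :3].T @ (X_rig - T_rig_cam[:3, 3])
    x = K @ X_cam
    return x[:2] / x[2]  # pixel coordinates

def rig_ba_residuals(K, extrinsics, rig_pose, points, observations):
    """Stack reprojection residuals over all cameras of the rig.

    Hypothetical sketch: only the shared `rig_pose` would be optimized
    (e.g. with a Gauss-Newton step), which is what makes the BA
    'lightweight' compared to per-camera pose optimization.
    """
    res = []
    for cam, obs in observations.items():
        for pid, uv in obs:
            res.append(reproject(K, extrinsics[cam], rig_pose, points[pid]) - uv)
    return np.concatenate(res)
```

A nonlinear least-squares solver (e.g. `scipy.optimize.least_squares`) could then minimize these residuals over a local pose parameterization.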
📝 Abstract
Recent advances in 3D Gaussian Splatting (3DGS) have enabled efficient free-viewpoint rendering and photorealistic scene reconstruction. While on-the-fly extensions of 3DGS have shown promise for real-time reconstruction from monocular RGB streams, they often fail to achieve complete 3D coverage due to the limited field of view (FOV). Employing a multi-camera rig fundamentally addresses this limitation. In this paper, we present the first on-the-fly 3D reconstruction framework for multi-camera rigs. Our method incrementally fuses dense RGB streams from multiple overlapping cameras into a unified Gaussian representation, achieving drift-free trajectory estimation and efficient online reconstruction. We propose a hierarchical camera initialization scheme that enables coarse inter-camera alignment without calibration, followed by a lightweight multi-camera bundle adjustment that stabilizes trajectories while maintaining real-time performance. Furthermore, we introduce a redundancy-free Gaussian sampling strategy and a frequency-aware optimization scheduler to reduce the number of Gaussian primitives and the required optimization iterations, thereby maintaining both efficiency and reconstruction fidelity. Our method reconstructs hundreds of meters of 3D scenes within just 2 minutes using only raw multi-camera video streams, demonstrating unprecedented speed, robustness, and fidelity for on-the-fly 3D scene reconstruction.
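With overlapping cameras, many candidate points would spawn Gaussians in regions already covered by another view. One plausible reading of the "redundancy-free Gaussian sampling strategy" is spatial deduplication: reject candidates whose voxel cell already contains a Gaussian center. The paper does not specify its mechanism, so the following is a minimal sketch under that assumption; the function name and voxel size are hypothetical:

```python
import numpy as np

def redundancy_free_sample(candidates, existing_centers, voxel=0.05):
    """Keep only candidate points whose voxel cell is not already
    occupied by an existing Gaussian center (illustrative sketch).

    candidates:       (N, 3) candidate 3D points from the new frames
    existing_centers: (M, 3) centers of Gaussians already in the map
    voxel:            cell edge length in scene units (assumed value)
    """
    def cell(p):
        return tuple(np.floor(p / voxel).astype(np.int64))

    occupied = {cell(p) for p in existing_centers}
    keep = np.array([cell(p) not in occupied for p in candidates])
    return candidates[keep]
```

This keeps the primitive count roughly proportional to covered volume rather than to the number of overlapping observations, which matches the abstract's stated goal of reducing Gaussian primitives while preserving coverage.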