🤖 AI Summary
To address the insufficient surface reconstruction accuracy of 3D Gaussian Splatting (3DGS) in large-scale outdoor scenes, which stems from high computational cost, complex dynamic appearance, and texture scarcity, this paper proposes a coarse-to-fine, image-supervised reconstruction framework. The method introduces three key innovations: (1) adaptive sub-scene partitioning to balance efficiency and geometric detail; (2) decoupled appearance modeling with a transient mask mechanism to explicitly represent dynamic objects and occlusions; and (3) single-view regularization combined with extended multi-view constraints to mitigate optimization instability in textureless regions. Evaluated on the GauU-Scene V2 dataset, the approach significantly outperforms state-of-the-art NeRF and 3DGS baselines, achieving high-fidelity visual quality and sub-centimeter geometric accuracy in large-scale surface reconstruction. This enables robust 3D perception for aerial mapping and autonomous driving applications.
📝 Abstract
Recent work on 3D Gaussian Splatting has significantly advanced surface reconstruction. However, scaling these methods to large-scale scenes remains challenging due to high computational demands and the complex dynamic appearances typical of outdoor environments. These challenges hinder their application in aerial surveying and autonomous driving. This paper proposes a novel solution for reconstructing large-scale surfaces with fine details, supervised by full-size images. First, we introduce a coarse-to-fine strategy that efficiently reconstructs a coarse model, followed by adaptive scene partitioning and sub-scene refinement from image segments. Second, we integrate a decoupled appearance model to capture global appearance variations and a transient mask model to mitigate interference from moving objects. Finally, we extend the multi-view constraint and introduce a single-view regularization for texture-less areas. Our experiments were conducted on the publicly available GauU-Scene V2 dataset, which was captured using unmanned aerial vehicles. To the best of our knowledge, our method outperforms existing NeRF-based and Gaussian-based methods, achieving high-fidelity visual results and accurate surfaces from full-size image optimization. Open-source code will be available on GitHub.
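As a rough illustration of what adaptive scene partitioning could look like, the sketch below recursively splits ground-plane positions at the median of the longer axis until each cell holds at most a fixed number of points, so dense regions end up with smaller cells. The abstract does not state the paper's actual splitting criterion, so the function name `partition`, the `max_points` threshold, and the median-split rule are all assumptions made for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position on the ground plane


def partition(points: List[Point], max_points: int) -> List[List[Point]]:
    """Split points into cells holding at most max_points points each.

    Hypothetical criterion (not taken from the paper): split a cell at the
    median of its longer axis whenever it holds more than max_points points.
    """
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    if not points:
        return []
    if len(points) <= max_points:
        return [points]

    # Choose the axis with the larger spatial extent.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1

    # A median split keeps both halves non-empty, so the recursion terminates.
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return partition(ordered[:mid], max_points) + partition(ordered[mid:], max_points)


if __name__ == "__main__":
    # Ten positions along a line, split into cells of at most three points.
    cells = partition([(float(i), 0.0) for i in range(10)], max_points=3)
    print([len(c) for c in cells])
```

In a full pipeline each resulting cell would presumably be refined as its own sub-scene from the image segments that observe it; that step is outside the scope of this sketch.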