Sparse View Distractor-Free Gaussian Splatting

📅 2026-03-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the significant performance degradation of existing distractor-free 3D Gaussian Splatting methods under sparse-view conditions, which stems from their reliance on unreliable color residual heuristics. To overcome this limitation, we propose a novel approach that integrates priors from a geometric foundation model (VGGT) and a Vision-Language Model (VLM). Specifically, we leverage VGGT's attention maps for the first time to enable semantic entity matching, while employing the VLM to identify large static regions, thereby effectively suppressing transient distractors. Our method substantially enhances the robustness and accuracy of distractor-free 3D reconstruction from sparse inputs. Extensive experiments demonstrate its superior performance compared to existing approaches.

📝 Abstract
3D Gaussian Splatting (3DGS) enables efficient training and fast novel view synthesis in static environments. To address challenges posed by transient objects, distractor-free 3DGS methods have emerged and shown promising results when dense image captures are available. However, their performance degrades significantly under sparse input conditions. This limitation primarily stems from the reliance on color residual heuristics to guide training, which become unreliable with limited observations. In this work, we propose a framework to enhance distractor-free 3DGS under sparse-view conditions by incorporating rich prior information. Specifically, we first adopt the geometry foundation model VGGT to estimate camera parameters and generate a dense set of initial 3D points. Then, we harness the attention maps from VGGT for efficient and accurate semantic entity matching. Additionally, we utilize Vision-Language Models (VLMs) to further identify and preserve the large static regions in the scene. We also demonstrate how these priors can be seamlessly integrated into existing distractor-free 3DGS methods. Extensive experiments confirm the effectiveness and robustness of our approach in mitigating transient distractors for sparse-view 3DGS training.
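The abstract describes fusing two priors into a per-pixel inlier weight for 3DGS training: a VGGT attention-based cross-view matching signal, and a VLM mask marking large static regions. The sketch below is a hypothetical illustration of that fusion step only, not the paper's actual implementation; `attn_sim`, `static_region`, and the threshold `tau` are assumed names for quantities the abstract implies.

```python
import numpy as np

def distractor_mask(attn_sim, static_region, tau=0.5):
    """Hypothetical fusion of the two priors described in the abstract.

    attn_sim: per-pixel cross-view attention similarity in [0, 1]
              (a stand-in for VGGT attention-based entity matching).
    static_region: boolean per-pixel mask of large static regions
                   (a stand-in for the VLM prior).
    Pixels with low cross-view similarity are treated as transient
    distractors unless the VLM marks them as static. Returns a float
    mask: 1.0 = keep the pixel in the photometric loss, 0.0 = suppress.
    """
    transient = (attn_sim < tau) & (~static_region)
    return np.where(transient, 0.0, 1.0)

# Toy 2x2 example: the low-similarity pixel at (0, 0) is suppressed,
# while the low-similarity pixel at (1, 1) is rescued by the static mask.
attn = np.array([[0.2, 0.9], [0.8, 0.1]])
static = np.array([[False, False], [False, True]])
mask = distractor_mask(attn, static)
print(mask)  # [[0. 1.] [1. 1.]]
```

In a training loop, such a mask would multiply the per-pixel color residual so that suspected distractor pixels contribute nothing to the 3DGS gradient, which is one plausible way to realize the "seamless integration" into existing distractor-free pipelines the abstract mentions.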
Problem

Research questions and friction points this paper is trying to address.

Sparse View
Distractor-Free
3D Gaussian Splatting
Transient Objects
Novel View Synthesis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse View
Distractor-Free 3DGS
Geometry Foundation Model
Vision-Language Model
Semantic Entity Matching
👥 Authors
Yi Gu
Nara Institute of Science and Technology
Zhaorui Wang
The Hong Kong University of Science and Technology (Guangzhou)
Jiahang Cao
The University of Hong Kong
Jiaxu Wang
The Hong Kong University of Science and Technology (Guangzhou)
Mingle Zhao
University of Macau
Dongjun Ye
The Hong Kong University of Science and Technology (Guangzhou)
Renjing Xu
HKUST(GZ)