GLS: Geometry-aware 3D Language Gaussian Splatting

📅 2024-11-27
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
This work addresses insufficient geometric sharpness and cross-view inconsistency in 3D Gaussian Splatting (3DGS) reconstruction and open-vocabulary segmentation for indoor scenes. It proposes GLS, a unified framework that jointly optimizes both tasks via geometry-aware modeling. Specifically, GLS uses surface normal priors to regularize depth estimation, which improves geometric fidelity, and couples CLIP-based semantic embeddings with DEVA video instance masks to enforce cross-view semantic consistency. GLS requires no additional annotations and optimizes reconstruction geometry and open-vocabulary segmentation end to end. Evaluated on the MuSHRoom, ScanNet++, and LERF-OVS benchmarks, GLS surpasses the state-of-the-art methods for each task, with gains in reconstruction quality (PSNR/SSIM) and open-vocabulary segmentation mIoU. These results support geometric-semantic co-modeling for joint 3D reconstruction and semantic understanding.

📝 Abstract
Recently, 3D Gaussian Splatting (3DGS) has achieved strong performance on indoor surface reconstruction and open-vocabulary segmentation. This paper presents GLS, a unified framework for surface reconstruction and open-vocabulary segmentation based on 3DGS. GLS advances both fields by exploiting the correlation between them. For indoor surface reconstruction, we introduce a surface normal prior as a geometric cue to guide the rendered normal, and use the normal error to optimize the rendered depth. For open-vocabulary segmentation, we employ 2D CLIP features to guide instance features and utilize DEVA masks to enhance their view consistency. Extensive experiments demonstrate the effectiveness of jointly optimizing surface reconstruction and open-vocabulary segmentation: GLS surpasses state-of-the-art approaches for each task on the MuSHRoom, ScanNet++, and LERF-OVS datasets. Code will be available at https://github.com/JiaxiongQ/GLS.
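
To make the geometric idea concrete, here is a minimal sketch of how a normal prior could supervise rendered depth. It is not the GLS implementation: the function names, the orthographic finite-difference normals, and the `1 - cosine` loss are all illustrative assumptions, chosen because they are a common way to express a normal-consistency term.

```python
import math

def normalize(v):
    """Return v scaled to unit length (zero vector if v has no length)."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v) if n > 0 else (0.0, 0.0, 0.0)

def normals_from_depth(depth):
    """Approximate normals from an H x W depth grid with forward differences.

    Assumption: an orthographic camera, so the normal is proportional to
    (-dz/dx, -dz/dy, 1). The output grid is (H-1) x (W-1).
    """
    h, w = len(depth), len(depth[0])
    normals = []
    for y in range(h - 1):
        row = []
        for x in range(w - 1):
            dzdx = depth[y][x + 1] - depth[y][x]
            dzdy = depth[y + 1][x] - depth[y][x]
            row.append(normalize((-dzdx, -dzdy, 1.0)))
        normals.append(row)
    return normals

def normal_prior_loss(rendered, prior):
    """Mean of (1 - cosine similarity) between two same-sized normal grids."""
    total, count = 0.0, 0
    for r_row, p_row in zip(rendered, prior):
        for r, p in zip(r_row, p_row):
            r, p = normalize(r), normalize(p)
            total += 1.0 - sum(a * b for a, b in zip(r, p))
            count += 1
    return total / count

# Hypothetical data: the prior says the surface faces the camera.
prior = [[(0.0, 0.0, 1.0)]]
flat_depth = [[1.0, 1.0], [1.0, 1.0]]    # agrees with the prior
tilted_depth = [[1.0, 2.0], [1.0, 2.0]]  # slopes along x, disagrees

flat_loss = normal_prior_loss(normals_from_depth(flat_depth), prior)
tilted_loss = normal_prior_loss(normals_from_depth(tilted_depth), prior)
```

Because the normals are computed from depth, any disagreement with the prior becomes an error signal on the depth values themselves, which is the coupling the abstract describes. In a real pipeline this would be a differentiable tensor operation rather than Python loops.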

Problem

Research questions and friction points this paper is trying to address.

Improving sharpness and smoothness in 3D surface reconstruction
Enhancing 3D open-vocabulary segmentation with CLIP features
Jointly optimizing reconstruction and segmentation using 3DGS

Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometry-aware 3D Gaussian Splatting for reconstruction
Surface normal prior enhances depth optimization
CLIP features guide segmentation with DEVA consistency
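
The open-vocabulary side can be illustrated with a short sketch, again under stated assumptions rather than as the authors' method. It assumes each instance already has one feature vector in the same space as the text embeddings, and labels each instance by cosine similarity. The function names and the 3-dimensional toy vectors are hypothetical; real CLIP embeddings have hundreds of dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def label_instances(instance_features, text_embeddings):
    """Assign each instance the text label with the highest similarity.

    instance_features: dict mapping instance id -> feature vector
    text_embeddings:   dict mapping label -> embedding vector
    Returns a dict mapping instance id -> (label, similarity score).
    """
    labels = {}
    for inst_id, feat in instance_features.items():
        best = max(text_embeddings,
                   key=lambda name: cosine(feat, text_embeddings[name]))
        labels[inst_id] = (best, cosine(feat, text_embeddings[best]))
    return labels

# Toy stand-ins for CLIP-aligned instance features and text query embeddings.
instance_features = {1: [1.0, 0.1, 0.0], 2: [0.0, 1.0, 0.2]}
text_embeddings = {"chair": [1.0, 0.0, 0.0], "table": [0.0, 1.0, 0.0]}

labels = label_instances(instance_features, text_embeddings)
```

Working per instance rather than per pixel is one way consistent instance masks help: every pixel of an instance shares a label, so the result cannot flicker between views as long as the instance ids are stable.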