Segment Any 3D Gaussians

📅 2023-12-01
🏛️ arXiv.org
📈 Citations: 49
Influential: 4
🤖 AI Summary
This work addresses the lack of efficient promptable 3D segmentation methods for 3D Gaussian Splatting (3D-GS) representations. It proposes SAGA, one of the first promptable 3D segmentation frameworks for 3D-GS, which runs in real time. The method attaches a scale-gated affinity feature to each 3D Gaussian and applies a soft scale gate to it; distills SAM's 2D segmentation capability into the 3D Gaussian space; and uses a scale-aware contrastive training strategy to handle multi-granularity segmentation ambiguity. Given 2D visual prompts such as points or bounding boxes, SAGA segments the corresponding 3D target within 4 ms, with quality comparable to state-of-the-art methods.
📝 Abstract
This paper presents SAGA (Segment Any 3D GAussians), a highly efficient 3D promptable segmentation method based on 3D Gaussian Splatting (3D-GS). Given 2D visual prompts as input, SAGA can segment the corresponding 3D target represented by 3D Gaussians within 4 ms. This is achieved by attaching a scale-gated affinity feature to each 3D Gaussian, endowing it with a new property for multi-granularity segmentation. Specifically, a scale-aware contrastive training strategy is proposed for learning the scale-gated affinity features. It 1) distills the segmentation capability of the Segment Anything Model (SAM) from 2D masks into the affinity features and 2) employs a soft scale gate mechanism to deal with multi-granularity ambiguity in 3D segmentation by adjusting the magnitude of each feature channel according to a specified 3D physical scale. Evaluations demonstrate that SAGA achieves real-time multi-granularity segmentation with quality comparable to state-of-the-art methods. As one of the first methods addressing promptable segmentation in 3D-GS, the simplicity and effectiveness of SAGA pave the way for future advancements in this field. Our code will be released.
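The soft scale gate described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the sigmoid-of-a-linear-map gate, the cosine-similarity threshold, and every name below (`scale_gate`, `gated_feature`, `segment`, the gate weights, and the example features) are assumptions made for illustration. The query feature is assumed to have already been obtained from a 2D prompt.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def scale_gate(scale, weights, biases):
    """Hypothetical soft gate: one value in (0, 1) per feature channel for a given 3D scale."""
    return [sigmoid(w * scale + b) for w, b in zip(weights, biases)]

def gated_feature(feature, gate):
    """Adjust the magnitude of each channel by its gate value, then L2-normalise."""
    scaled = [f * g for f, g in zip(feature, gate)]
    norm = math.sqrt(sum(v * v for v in scaled)) or 1.0
    return [v / norm for v in scaled]

def segment(features, query, gate, threshold=0.8):
    """Return indices of Gaussians whose gated feature is similar to the gated query."""
    q = gated_feature(query, gate)
    selected = []
    for idx, feat in enumerate(features):
        g = gated_feature(feat, gate)
        similarity = sum(a * b for a, b in zip(g, q))
        if similarity >= threshold:
            selected.append(idx)
    return selected

# Toy affinity features for three Gaussians (assumed values).
# Channel 0 separates objects; channel 1 separates parts within an object.
features = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0]]
query = features[0]  # assumed to come from a 2D point prompt that hits Gaussian 0

# Assumed gate parameters: channel 0 is always open, channel 1 closes at large scales.
weights, biases = [0.0, -20.0], [10.0, 10.0]

coarse = segment(features, query, scale_gate(1.0, weights, biases))
fine = segment(features, query, scale_gate(0.1, weights, biases))
print(coarse)  # [0, 1]: at the coarse scale, Gaussians 0 and 1 form one object
print(fine)    # [0]: at the fine scale, only the prompted part is selected
```

The same features yield different segments depending only on the scale passed to the gate, which is the sense in which a single feature per Gaussian can serve several granularities.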
Problem

Research questions and friction points this paper is trying to address.

Efficient 3D promptable segmentation
Multi-granularity ambiguity in 3D
Real-time segmentation in 3D-GS
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D Gaussian Splatting based segmentation
Scale-gated affinity feature learning
Real-time multi-granularity 3D segmentation
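As a rough illustration of how affinity features might be learned contrastively from 2D masks, the sketch below uses a generic pairwise contrastive loss. It is not the loss from the paper: the formulation, the function names, and the toy data are assumptions. Features are assumed to be already gated at the scale of the masks, and `mask_ids` stands in for the mask each sample falls into at that scale.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def pairwise_contrastive_loss(features, mask_ids):
    """Pull features in the same mask together and push features in different masks apart."""
    total, count = 0.0, 0
    n = len(features)
    for i in range(n):
        for j in range(i + 1, n):
            sim = cosine(features[i], features[j])
            if mask_ids[i] == mask_ids[j]:
                total += 1.0 - sim      # same mask: penalise low similarity
            else:
                total += max(sim, 0.0)  # different masks: penalise positive similarity
            count += 1
    return total / count if count else 0.0

# Toy features (assumed values): the first two samples point the same way.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

# Masks that agree with the features give zero loss ...
agreeing = pairwise_contrastive_loss(features, [0, 0, 1])
# ... while masks that split the two identical features give a positive loss.
conflicting = pairwise_contrastive_loss(features, [0, 1, 1])
print(agreeing, conflicting)
```

Minimising such a loss over many sampled pairs would move the features toward agreement with the masks; making it scale-aware amounts to computing it on features gated at the scale of each mask.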
Jiazhong Cen
Shanghai Jiao Tong University
Computer Vision, 3D Scene Understanding
Jiemin Fang
Senior Researcher, Huawei
Neural Rendering, 3D Vision, AutoML, Neural Architecture Search, Computer Vision
Chen Yang
MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University
Lingxi Xie
Huawei Inc.
Xiaopeng Zhang
Huawei Inc.
Wei Shen
MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University
Qi Tian
Huawei Inc.