Binary-Gaussian: Compact and Progressive Representation for 3D Gaussian Segmentation

2025-11-30
Citations: 0
Influential: 0
AI Summary
To address the challenges of high memory consumption, label-space congestion, and difficulty in fine-grained segmentation arising from high-dimensional features in 3D Gaussian Splatting (3D-GS) semantic segmentation, this paper proposes a binary encoding and progressive learning framework. Methodologically, it introduces (1) a binary-to-decimal mapping to compress categorical features, drastically reducing GPU memory footprint; (2) a layer-wise decoding scheme employing coarse-to-fine binary representations to mitigate label conflicts; and (3) multi-stage independent subtask training coupled with opacity-aware joint fine-tuning to decouple rendering fidelity and semantic segmentation optimization. Evaluated on multiple benchmarks, the method achieves state-of-the-art segmentation accuracy while reducing GPU memory usage by 42% and accelerating inference by 3.1×, demonstrating a favorable trade-off between computational efficiency and fine-grained semantic expressiveness.

Abstract
3D Gaussian Splatting (3D-GS) has emerged as an efficient 3D representation and a promising foundation for semantic tasks like segmentation. However, existing 3D-GS-based segmentation methods typically rely on high-dimensional category features, which introduce substantial memory overhead. Moreover, fine-grained segmentation remains challenging due to label space congestion and the lack of stable multi-granularity control mechanisms. To address these limitations, we propose a coarse-to-fine binary encoding scheme for per-Gaussian category representation, which compresses each feature into a single integer via the binary-to-decimal mapping, drastically reducing memory usage. We further design a progressive training strategy that decomposes panoptic segmentation into a series of independent sub-tasks, reducing inter-class conflicts and thereby enhancing fine-grained segmentation capability. Additionally, we fine-tune opacity during segmentation training to address the incompatibility between photometric rendering and semantic segmentation, which often leads to foreground-background confusion. Extensive experiments on multiple benchmarks demonstrate that our method achieves state-of-the-art segmentation performance while significantly reducing memory consumption and accelerating inference.
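
The binary-to-decimal mapping mentioned in the abstract can be illustrated with a minimal sketch. It assumes each Gaussian carries a fixed-width binary code with the most significant bit first; the function names `bits_to_int` and `int_to_bits` are hypothetical and are not taken from the paper, which may store or order its bits differently.

```python
from typing import List


def bits_to_int(bits: List[int]) -> int:
    """Pack a binary code (most significant bit first) into one integer.

    Assumption: the code is a plain list of 0/1 values; this is an
    illustration, not the paper's storage format.
    """
    value = 0
    for b in bits:
        if b not in (0, 1):
            raise ValueError("every bit must be 0 or 1")
        value = (value << 1) | b
    return value


def int_to_bits(value: int, width: int) -> List[int]:
    """Unpack an integer into a fixed-width binary code (MSB first)."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the requested width")
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]


code = [1, 0, 1, 1]                 # hypothetical 4-bit category code
packed = bits_to_int(code)          # 0b1011 == 11
restored = int_to_bits(packed, 4)   # [1, 0, 1, 1]
```

Under this assumption, a code of any width up to the integer size occupies a single stored value per Gaussian instead of one value per feature dimension, which is the source of the memory saving the abstract describes.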
Problem

Research questions and friction points this paper is trying to address.

Reduces memory overhead in 3D Gaussian segmentation
Enhances fine-grained segmentation with progressive training
Addresses foreground-background confusion via opacity fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Binary encoding compresses features into single integers
Progressive training decomposes segmentation into independent sub-tasks
Opacity fine-tuning resolves foreground-background confusion
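
One way to picture the coarse-to-fine reading of a binary code is to treat each additional bit as one finer level of granularity. The sketch below assumes the most significant bit is the coarsest split and that a level-`k` label is simply the top `k` bits; `label_at_level` and the example codes are hypothetical, and the paper's actual layer-wise decoding may differ.

```python
from typing import List


def label_at_level(packed: int, width: int, level: int) -> int:
    """Return the label formed by the `level` most significant bits.

    Assumption: `packed` is a `width`-bit integer code whose leading bits
    encode coarser groupings than its trailing bits.
    """
    if not 1 <= level <= width:
        raise ValueError("level must be between 1 and width")
    if not 0 <= packed < (1 << width):
        raise ValueError("packed code does not fit in width bits")
    return packed >> (width - level)


WIDTH = 4
packed_codes: List[int] = [0b1011, 0b1010, 0b0110]  # hypothetical per-Gaussian codes

coarse = [label_at_level(c, WIDTH, 1) for c in packed_codes]  # [1, 1, 0]
middle = [label_at_level(c, WIDTH, 3) for c in packed_codes]  # [5, 5, 3]
fine = [label_at_level(c, WIDTH, 4) for c in packed_codes]    # [11, 10, 6]
```

In this toy example the first two Gaussians share a label at the coarse and middle levels and are only separated by the last bit, which mirrors the idea of resolving one sub-task per stage rather than all classes at once.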