TreeGaussian: Tree-Guided Cascaded Contrastive Learning for Hierarchical Consistent 3D Gaussian Scene Segmentation and Understanding

📅 2026-03-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing 3D Gaussian Splatting (3DGS) methods struggle to model hierarchical semantic structure and part-whole relationships in complex scenes, and their reliance on 2D priors often produces inconsistent cross-view labels, which limits segmentation performance. To address these limitations, this work proposes TreeGaussian, a tree-guided cascaded contrastive learning framework that explicitly constructs a multi-level object hierarchy to capture semantic structure. The framework uses a cascaded contrastive learning mechanism to reduce supervisory redundancy, and it combines a consistency-aware segmentation refinement module with a graph neural network-based denoising component to make cross-view segmentation more stable and robust. According to the reported experiments, the method outperforms existing approaches on open-vocabulary 3D object selection and point cloud understanding tasks, with better segmentation consistency, quality, and structural awareness.
📝 Abstract
3D Gaussian Splatting (3DGS) has emerged as a real-time, differentiable representation for neural scene understanding. However, existing 3DGS-based methods struggle to represent hierarchical 3D semantic structures and capture whole-part relationships in complex scenes. Moreover, dense pairwise comparisons and inconsistent hierarchical labels from 2D priors hinder feature learning, resulting in suboptimal segmentation. To address these limitations, we introduce TreeGaussian, a tree-guided cascaded contrastive learning framework that explicitly models hierarchical semantic relationships and reduces redundancy in contrastive supervision. By constructing a multi-level object tree, TreeGaussian enables structured learning across object-part hierarchies. In addition, we propose a two-stage cascaded contrastive learning strategy that progressively refines feature representations from global to local, mitigating saturation and stabilizing training. A Consistent Segmentation Detection (CSD) mechanism and a graph-based denoising module are further introduced to align segmentation modes across views while suppressing unstable Gaussian points, enhancing segmentation consistency and quality. Extensive experiments, including open-vocabulary 3D object selection, 3D point cloud understanding, and ablation studies, demonstrate the effectiveness and robustness of our approach.
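The abstract does not spell out the loss, so the following is only a minimal sketch of what a two-stage, global-to-local contrastive objective over a two-level object tree could look like. It assumes each Gaussian carries a feature vector, an object label, and a part label, and it swaps in a prototype-based InfoNCE loss for illustration. The names `prototype_nce` and `cascaded_loss`, the temperature, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def _normalize(x):
    # L2-normalize rows so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)

def prototype_nce(features, labels, temperature=0.1):
    """InfoNCE of each feature against per-label mean prototypes.

    Comparing against K prototypes instead of all N^2 pairs is one way to
    avoid dense pairwise comparisons (an assumption, not the paper's loss).
    """
    features = _normalize(features)
    ids, inverse = np.unique(labels, return_inverse=True)
    protos = np.stack([features[inverse == k].mean(axis=0) for k in range(len(ids))])
    logits = features @ _normalize(protos).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(inverse)), inverse].mean())

def cascaded_loss(features, object_ids, part_ids, temperature=0.1):
    """Stage 1 (global): separate objects. Stage 2 (local): separate the
    parts inside each object, using only that object's Gaussians."""
    coarse = prototype_nce(features, object_ids, temperature)
    fine_terms = []
    for obj in np.unique(object_ids):
        mask = object_ids == obj
        if len(np.unique(part_ids[mask])) > 1:  # skip objects with one part
            fine_terms.append(prototype_nce(features[mask], part_ids[mask], temperature))
    fine = float(np.mean(fine_terms)) if fine_terms else 0.0
    return coarse, fine

# Toy data: 4 parts (8 Gaussians each), grouped into 2 objects of 2 parts.
rng = np.random.default_rng(0)
part_ids = np.repeat(np.arange(4), 8)
object_ids = part_ids // 2
features = np.eye(4)[part_ids] + 0.01 * rng.normal(size=(32, 4))

coarse, fine = cascaded_loss(features, object_ids, part_ids)
print("coarse (object-level) loss:", coarse)
print("fine (part-level) loss:", fine)
```

In this sketch the second stage only contrasts parts that share a parent object, which is one plausible reading of "from global to local". How the two terms are weighted or scheduled during training is left open here because the abstract does not say.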
Problem

Research questions and friction points this paper is trying to address.

3D Gaussian Splatting
hierarchical semantic structure
scene segmentation
whole-part relationships
feature learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Tree-Guided Learning
Cascaded Contrastive Learning
Hierarchical Segmentation
3D Gaussian Splatting
Consistent Scene Understanding
Jingbin You
Institute of Computing Technology, Chinese Academy of Sciences, ICT; University of Chinese Academy of Sciences, UCAS
Zehao Li
Peking University
Operations research, Stochastic approximation
Hao Jiang
Institute of Computing Technology, Chinese Academy of Sciences, ICT; University of Chinese Academy of Sciences, UCAS
Xinzhu Ma
Associate Professor, Beihang University
deep learning, computer vision, 3D scene understanding, ai4science
Shuqin Gao
Institute of Computing Technology, Chinese Academy of Sciences, ICT; University of Chinese Academy of Sciences, UCAS
Honglong Zhao
Institute of Computing Technology, Chinese Academy of Sciences, ICT; University of Chinese Academy of Sciences, UCAS
Congcong Zheng
Institute of Computing Technology, Chinese Academy of Sciences, ICT; University of Chinese Academy of Sciences, UCAS
Tianlu Mao
Institute of Computing Technology, Chinese Academy of Sciences, ICT; University of Chinese Academy of Sciences, UCAS
Feng Dai
Institute of Computing Technology, Chinese Academy of Sciences
video coding and processing, computational imaging
Yucheng Zhang
Purdue University
Knowledge Graph, Large Language Models
Zhaoqi Wang
Institute of Computing Technology, Chinese Academy of Sciences, ICT; University of Chinese Academy of Sciences, UCAS