Interpretable Single-View 3D Gaussian Splatting using Unsupervised Hierarchical Disentangled Representation Learning

📅 2025-04-05
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing 3D Gaussian Splatting (3DGS) methods do not explicitly model underlying 3D semantics, which limits controllability and interpretability. The paper proposes 3DisGS, a single-view 3DGS framework that the authors describe as, to their knowledge, the first to achieve unsupervised interpretable 3DGS. A dual-branch architecture, consisting of a point cloud initialization branch and a triplane-Gaussian generation branch, provides coarse-grained disentanglement by separating 3D geometry from visual appearance. Encoder-adapters based on disentangled representation learning (DRL) then discover fine-grained semantic factors within each of the two modalities. According to the reported evaluations, the model achieves 3D disentanglement while preserving high-quality, rapid reconstruction.

📝 Abstract
3D Gaussian Splatting (3DGS) has recently marked a significant advancement in 3D reconstruction, delivering both rapid rendering and high-quality results. However, existing 3DGS methods make it difficult to understand the underlying 3D semantics, which hinders model controllability and interpretability. To address this, we propose an interpretable single-view 3DGS framework, termed 3DisGS, to discover both coarse- and fine-grained 3D semantics via hierarchical disentangled representation learning (DRL). Specifically, the model employs a dual-branch architecture, consisting of a point cloud initialization branch and a triplane-Gaussian generation branch, to achieve coarse-grained disentanglement by separating 3D geometry and visual appearance features. Subsequently, fine-grained semantic representations within each modality are further discovered through DRL-based encoder-adapters. To our knowledge, this is the first work to achieve unsupervised interpretable 3DGS. Evaluations indicate that our model achieves 3D disentanglement while preserving high-quality, rapid reconstruction.
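
The two-level idea in the abstract can be illustrated with a small sketch. This is not the 3DisGS implementation: the abstract does not state which DRL objective or latent sizes are used, so the sketch assumes a β-VAE-style KL penalty (a common choice in unsupervised disentanglement), and every name in it (`GEO_DIM`, `APP_DIM`, `beta_vae_loss`, `traverse`) is hypothetical. It keeps geometry and appearance codes in separate vectors (the coarse level) and sweeps a single coordinate of one code while the other stays fixed (the fine level).

```python
import math
import random

# Hypothetical latent sizes; the source does not give these.
GEO_DIM, APP_DIM = 4, 4


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def beta_vae_loss(recon_error, mu, log_var, beta=4.0):
    """Reconstruction error plus a beta-weighted KL term (assumed objective)."""
    return recon_error + beta * kl_to_standard_normal(mu, log_var)


def traverse(latent, index, values):
    """Return copies of `latent` with coordinate `index` set to each value."""
    swept = []
    for v in values:
        z = list(latent)  # copy so the input code is left untouched
        z[index] = v
        swept.append(z)
    return swept


# Coarse level: geometry and appearance live in separate codes.
rng = random.Random(0)
geometry_code = [rng.gauss(0.0, 1.0) for _ in range(GEO_DIM)]
appearance_code = [rng.gauss(0.0, 1.0) for _ in range(APP_DIM)]

# Fine level: vary one appearance factor; the geometry code is not modified.
sweep = traverse(appearance_code, index=2, values=[-2.0, 0.0, 2.0])
```

In a trained model, each code in `sweep` would be decoded together with the unchanged `geometry_code`; if the factors are disentangled, the renderings should differ in a single appearance attribute only.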
Problem

Research questions and friction points this paper is trying to address.

Achieve unsupervised interpretable 3D Gaussian Splatting
Separate 3D geometry and appearance features
Discover hierarchical disentangled semantic representations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised hierarchical disentangled representation learning
Dual-branch architecture for coarse-grained disentanglement
DRL-based encoder-adapters for fine-grained semantics
Authors

Yuyang Zhang
Graduate Student, Harvard University
Reinforcement Learning, Control Theory

Baao Xie
Ningbo Institute of Digital Twin, Eastern Institute of Technology; Zhejiang Key Laboratory of Industrial Intelligence and Digital Twin, Eastern Institute of Technology

Hu Zhu
College of Telecommunications and Information Engineering, Nanjing University of Posts and
Computational Photography, 3D imaging, target detection, infrared imaging

Qi Wang
Shanghai Jiao Tong University; Ningbo Institute of Digital Twin, Eastern Institute of Technology; Zhejiang Key Laboratory of Industrial Intelligence and Digital Twin, Eastern Institute of Technology

Huanting Guo
Ningbo Institute of Digital Twin, Eastern Institute of Technology; Zhejiang Key Laboratory of Industrial Intelligence and Digital Twin, Eastern Institute of Technology

Xin Jin
Ningbo Institute of Digital Twin, Eastern Institute of Technology; Zhejiang Key Laboratory of Industrial Intelligence and Digital Twin, Eastern Institute of Technology

Wenjun Zeng
Ningbo Institute of Digital Twin, Eastern Institute of Technology; Zhejiang Key Laboratory of Industrial Intelligence and Digital Twin, Eastern Institute of Technology