Interpretable Single-View 3D Gaussian Splatting using Unsupervised Hierarchical Disentangled Representation Learning

📅 2025-04-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing 3D Gaussian Splatting (3DGS) methods do not explicitly model the underlying 3D semantics, which limits their controllability and interpretability. To address this, we propose 3DisGS, to our knowledge the first unsupervised, hierarchically disentangled single-view 3DGS framework. Our method uses a dual-branch architecture, consisting of a point cloud initialization branch and a triplane-guided Gaussian generation branch, coupled with Disentangled Representation Learning (DRL). Disentanglement is hierarchical: at the coarse-grained level, 3D geometry is separated from visual appearance; at the fine-grained level, DRL-based encoder-adapters discover semantic factors within each of these two modalities. No additional annotations are required. Evaluations indicate that the approach preserves high-quality, rapid reconstruction while enabling independent geometric/appearance editing and semantics-driven manipulation, which improves both interpretability and generative controllability.
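The independent geometric/appearance editing mentioned above can be illustrated with a toy example. The sketch below is not the paper's implementation: the `Gaussian` class, the `translate` and `recolor` helpers, and the grouping of attributes into a geometry set (mean, scale, rotation) and an appearance set (opacity, color) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, replace
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Gaussian:
    """One splat. The geometry/appearance grouping is an illustrative assumption."""
    # Geometry attributes
    mean: Vec3                                   # center position
    scale: Vec3                                  # per-axis extent
    rotation: Tuple[float, float, float, float]  # unit quaternion (w, x, y, z)
    # Appearance attributes
    opacity: float
    rgb: Vec3


def translate(g: Gaussian, offset: Vec3) -> Gaussian:
    """Geometry-only edit: shift the center, leave appearance untouched."""
    new_mean = tuple(m + o for m, o in zip(g.mean, offset))
    return replace(g, mean=new_mean)


def recolor(g: Gaussian, rgb: Vec3) -> Gaussian:
    """Appearance-only edit: change the color, leave geometry untouched."""
    return replace(g, rgb=rgb)


g = Gaussian(
    mean=(0.0, 0.0, 0.0),
    scale=(0.1, 0.1, 0.1),
    rotation=(1.0, 0.0, 0.0, 0.0),
    opacity=0.9,
    rgb=(1.0, 0.0, 0.0),
)
moved = translate(g, (0.5, 0.0, 0.0))
painted = recolor(g, (0.0, 0.0, 1.0))
```

Here the separation holds by construction because each helper touches only one attribute group. In a learned model the separation has to be encouraged by the architecture and training objective, which is the problem the framework targets.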

📝 Abstract

Gaussian Splatting (GS) has recently marked a significant advancement in 3D reconstruction, delivering both rapid rendering and high-quality results. However, existing 3DGS methods offer little insight into the underlying 3D semantics, which hinders model controllability and interpretability. To address this, we propose an interpretable single-view 3DGS framework, termed 3DisGS, to discover both coarse- and fine-grained 3D semantics via hierarchical disentangled representation learning (DRL). Specifically, the model employs a dual-branch architecture, consisting of a point cloud initialization branch and a triplane-Gaussian generation branch, to achieve coarse-grained disentanglement by separating 3D geometry and visual appearance features. Subsequently, fine-grained semantic representations within each modality are further discovered through DRL-based encoder-adapters. To our knowledge, this is the first work to achieve unsupervised interpretable 3DGS. Evaluations indicate that our model achieves 3D disentanglement while preserving high-quality, rapid reconstruction.
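For readers unfamiliar with triplane representations, the sketch below shows how a feature vector is commonly looked up for a single 3D point: the point is projected onto three axis-aligned feature planes and the sampled features are combined. The nearest-neighbor sampling, the summation, and all names (`make_plane`, `to_index`, `query_triplane`) are assumptions chosen for brevity; the abstract does not specify how 3DisGS samples or aggregates its triplane features.

```python
from typing import Dict, List, Tuple

Plane = List[List[List[float]]]  # indexed as [row][col][channel]


def make_plane(res: int, channels: int, fill: float) -> Plane:
    """Build a res x res feature plane with constant features (toy data)."""
    return [[[fill] * channels for _ in range(res)] for _ in range(res)]


def to_index(coord: float, res: int) -> int:
    """Map a coordinate in [-1, 1] to the nearest grid index in [0, res - 1]."""
    clamped = max(-1.0, min(1.0, coord))
    return int(round((clamped + 1.0) / 2.0 * (res - 1)))


def query_triplane(planes: Dict[str, Plane],
                   point: Tuple[float, float, float]) -> List[float]:
    """Project the point onto the XY, XZ and YZ planes and sum the features."""
    x, y, z = point
    res = len(planes["xy"])
    f_xy = planes["xy"][to_index(y, res)][to_index(x, res)]
    f_xz = planes["xz"][to_index(z, res)][to_index(x, res)]
    f_yz = planes["yz"][to_index(z, res)][to_index(y, res)]
    return [a + b + c for a, b, c in zip(f_xy, f_xz, f_yz)]


# Three 4x4 planes with 2 channels each; constant fills make the sum easy to read.
planes = {
    "xy": make_plane(4, 2, 1.0),
    "xz": make_plane(4, 2, 10.0),
    "yz": make_plane(4, 2, 100.0),
}
feature = query_triplane(planes, (0.2, -0.7, 1.0))
```

In a pipeline of the kind the abstract describes, a feature queried this way would be decoded into the attributes of a Gaussian located at that point, with the point locations supplied by the point cloud branch. Practical implementations typically use bilinear interpolation rather than nearest-neighbor lookup.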

Problem

Research questions and friction points this paper is trying to address.

Achieve unsupervised interpretable 3D Gaussian Splatting

Separate 3D geometry and appearance features

Discover hierarchical disentangled semantic representations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised hierarchical disentangled representation learning

Dual-branch architecture for coarse-grained disentanglement

DRL-based encoder-adapters for fine-grained semantics
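This page does not state which DRL objective the encoder-adapters use. As a point of reference only, the sketch below computes the KL regularizer used by β-VAE, a widely used unsupervised disentanglement objective that is swapped in here for illustration. The function names, the `beta` value, and the use of a scalar reconstruction error are all assumptions, not details from the paper.

```python
import math
from typing import Sequence


def kl_to_standard_normal(mu: Sequence[float], logvar: Sequence[float]) -> float:
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))


def beta_vae_loss(recon_error: float,
                  mu: Sequence[float],
                  logvar: Sequence[float],
                  beta: float = 4.0) -> float:
    """Reconstruction error plus a beta-weighted KL penalty on the latent code."""
    return recon_error + beta * kl_to_standard_normal(mu, logvar)


# A latent that already matches the prior incurs no penalty.
kl_zero = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
# Shifting one latent mean by 1 costs 0.5 nats.
kl_shift = kl_to_standard_normal([1.0, 0.0], [0.0, 0.0])
loss = beta_vae_loss(0.25, [1.0, 0.0], [0.0, 0.0], beta=4.0)
```

Setting `beta` above 1 pressures each latent dimension toward the factorized prior, which tends to encourage individual dimensions to capture separate factors of variation at some cost in reconstruction quality.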
