HierOctFusion: Multi-scale Octree-based 3D Shape Generation via Part-Whole-Hierarchy Message Passing

📅 2025-08-14
🤖 AI Summary
Problem: Existing octree-based diffusion models treat 3D shapes as monolithic entities and ignore semantic part-whole hierarchies, which limits generalization and makes high-resolution modeling computationally expensive. Method: The paper proposes HierOctFusion, a multi-scale octree diffusion model that passes messages hierarchically from parts to the whole, so that generation proceeds in layers from local parts to global structure. Specifically, it (1) uses a part-aware cross-attention conditioning scheme to inject part-level semantic information; (2) constructs a 3D dataset with part category annotations produced by a pre-trained segmentation model; and (3) strengthens feature interaction across octree scales. Results: Experiments report better shape quality, particularly for fine-grained and sparse structures, and better efficiency than prior methods.

📝 Abstract
3D content generation remains a fundamental yet challenging task due to the inherent structural complexity of 3D data. While recent octree-based diffusion models offer a promising balance between efficiency and quality through hierarchical generation, they often overlook two key insights: 1) existing methods typically model 3D objects as holistic entities, ignoring their semantic part hierarchies and limiting generalization; and 2) holistic high-resolution modeling is computationally expensive, whereas real-world objects are inherently sparse and hierarchical, making them well-suited for layered generation. Motivated by these observations, we propose HierOctFusion, a part-aware multi-scale octree diffusion model that enhances hierarchical feature interaction for generating fine-grained and sparse object structures. Furthermore, we introduce a cross-attention conditioning mechanism that injects part-level information into the generation process, enabling semantic features to propagate effectively across hierarchical levels from parts to the whole. Additionally, we construct a 3D dataset with part category annotations using a pre-trained segmentation model to facilitate training and evaluation. Experiments demonstrate that HierOctFusion achieves superior shape quality and efficiency compared to prior methods.
Problem

Research questions and friction points this paper is trying to address.

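Existing octree-based diffusion models treat 3D objects as holistic entities, ignoring semantic part hierarchies and limiting generalization
Holistic high-resolution modeling is computationally expensive, even though real-world objects are sparse and hierarchical
Part-level semantic features need to propagate effectively across hierarchical levels, from parts to the whole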
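
To make the sparsity argument concrete, the sketch below counts occupied octree cells for a thin surface and compares that with a dense voxel grid at the same resolution. It is only an illustration of why hierarchical sparse representations are cheaper, not the paper's data pipeline; the function name `occupied_cells_per_level` and the flat-sheet test shape are assumptions made for this example.

```python
def occupied_cells_per_level(points, depth):
    """Return one set of occupied cell indices per octree level 0..depth.

    points: iterable of (x, y, z) tuples with coordinates in [0, 1).
    Level L splits each axis into 2**L cells.
    """
    levels = []
    for level in range(depth + 1):
        res = 1 << level  # cells per axis at this level
        occupied = {
            (int(x * res), int(y * res), int(z * res)) for x, y, z in points
        }
        levels.append(occupied)
    return levels


# Hypothetical test shape: a flat sheet at z = 0.5 sampled on a 64 x 64 grid.
sheet = [(i / 64, j / 64, 0.5) for i in range(64) for j in range(64)]

DEPTH = 6
levels = occupied_cells_per_level(sheet, DEPTH)
sparse_count = len(levels[DEPTH])   # cells an octree has to store at depth 6
dense_count = (1 << DEPTH) ** 3     # cells in a dense 64^3 voxel grid

print(f"occupied: {sparse_count}, dense: {dense_count}")
```

For this sheet the octree stores one cell per surface sample at the finest level, while the dense grid allocates every cell in the volume, so the gap widens as resolution grows.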
Innovation

Methods, ideas, or system contributions that make the work stand out.

Part-aware multi-scale octree diffusion model
Cross-attention conditioning for part-level generation
Hierarchical feature interaction for fine-grained structures
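
The cross-attention conditioning idea can be sketched as follows: each octree node feature acts as a query and attends over a small set of part-level embeddings, and the attended result is added back to the node feature. This is generic scaled dot-product cross-attention written in plain Python, not the authors' implementation; the learned query/key/value projections are omitted, and the names `cross_attention`, `node_feats`, and `part_feats` are assumptions for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(node_feats, part_feats):
    """Condition node features on part embeddings.

    node_feats: list of feature vectors, one per octree node (queries).
    part_feats: list of part embeddings (used as both keys and values).
    Assumption: identity projections, single head, residual connection.
    """
    dim = len(part_feats[0])
    scale = 1.0 / math.sqrt(dim)
    conditioned = []
    for query in node_feats:
        # Attention weights of this node over all parts.
        weights = softmax([dot(query, key) * scale for key in part_feats])
        # Weighted sum of part embeddings.
        attended = [
            sum(w * value[i] for w, value in zip(weights, part_feats))
            for i in range(dim)
        ]
        # Residual: inject part information without discarding the node feature.
        conditioned.append([q + a for q, a in zip(query, attended)])
    return conditioned


# Two nodes, two parts: each node is pulled toward the part it aligns with.
nodes = [[1.0, 0.0], [0.0, 1.0]]
parts = [[4.0, 0.0], [0.0, 4.0]]
result = cross_attention(nodes, parts)
```

In a real model the queries, keys, and values would pass through learned linear layers and the operation would run per octree level, but the flow of information from parts to nodes is the same.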
Authors
Xinjie Gao
Wangxuan Institute of Computer Technology, Peking University
Bi'an Du
Peking University
3D Computer Vision, Generative Models for 3D
Wei Hu
Wangxuan Institute of Computer Technology, Peking University