🤖 AI Summary
Most existing single-image 3D generation methods produce a single fused mesh, which prevents part-level editing and cannot handle objects with varying numbers of parts. To address this, we propose an end-to-end framework for part-level 3D object generation from a single input image. The core idea is a dual volume packing strategy that organizes all parts into two complementary volumes, so that touching or interleaved parts can each be generated as complete, semantically meaningful shapes and then assembled into the final object, accommodating an arbitrary number of parts. Experiments show that the method achieves better quality, diversity, and generalization than previous image-based part-level generation methods, enabling editable 3D content creation from a single image.
📝 Abstract
Recent progress in 3D object generation has greatly improved both quality and efficiency. However, most existing methods generate a single mesh with all parts fused together, which limits the ability to edit or manipulate individual parts. A key challenge is that different objects may have a varying number of parts. To address this, we propose a new end-to-end framework for part-level 3D object generation. Given a single input image, our method generates high-quality 3D objects with an arbitrary number of complete and semantically meaningful parts. We introduce a dual volume packing strategy that organizes all parts into two complementary volumes, allowing for the creation of complete and interleaved parts that assemble into the final object. Experiments show that our model achieves better quality, diversity, and generalization than previous image-based part-level generation methods.
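One way to think about the dual volume packing idea is as a two-coloring of the part-contact graph: parts that touch must land in different volumes so that each can be represented as a complete, separable shape. Below is a minimal illustrative sketch under that reading; the function name, the contact-graph input, and the greedy conflict rule are assumptions for exposition, not the paper's actual algorithm (which, for instance, must also handle non-bipartite contact graphs).

```python
# Hypothetical sketch of the dual volume packing idea: assign each part to
# one of two volumes so that parts in contact never share a volume. The
# greedy heuristic here is illustrative, not the paper's method.
from collections import defaultdict

def pack_parts_into_two_volumes(num_parts, contacts):
    """Greedy 2-coloring of the part-contact graph.

    num_parts: number of parts in the object.
    contacts:  list of (i, j) pairs of parts that touch each other.
    Returns vol, a list with vol[i] in {0, 1} assigning part i to a volume.
    If the contact graph is not bipartite, some contacts cannot be
    separated; a real system would need to resolve those cases first
    (assumption), e.g. by merging the conflicting parts.
    """
    neighbors = defaultdict(set)
    for i, j in contacts:
        neighbors[i].add(j)
        neighbors[j].add(i)

    vol = [-1] * num_parts
    # Place highly connected parts first (a common greedy-coloring heuristic).
    for p in sorted(range(num_parts), key=lambda p: -len(neighbors[p])):
        # Count already-assigned touching parts in each volume and pick the
        # volume with fewer conflicts (zero whenever the graph is bipartite).
        conflicts = [0, 0]
        for q in neighbors[p]:
            if vol[q] != -1:
                conflicts[vol[q]] += 1
        vol[p] = 0 if conflicts[0] <= conflicts[1] else 1
    return vol

# Example: a chair whose seat (part 0) touches four legs and a back
# (parts 1-5), none of which touch each other. The seat goes into one
# volume and the legs and back into the other.
contacts = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5)]
print(pack_parts_into_two_volumes(6, contacts))  # -> [0, 1, 1, 1, 1, 1]
```

Under this reading, each volume then contains only mutually non-touching parts, so every part boundary is free of interpenetration and the two volumes can be decoded independently before assembly.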