AI Summary
To address the scarcity of high-quality, structured training data that hinders 3D content generation, this work introduces an open-source 3D asset ecosystem. It proposes a hybrid data strategy that integrates real-world, high-fidelity 3D objects with AI-generated assets covering long-tail categories, enriched with part-level semantic annotations to enable fine-grained editing and perception. Leveraging high-fidelity mesh processing, multi-view rendering, and scalable AIGC-based synthesis, the project releases a large-scale dataset of 250,000 real and 125,000 synthetic 3D assets. This dataset supports the training of the Hunyuan3D-2.1-Small model and advances the application of 3D generative models across multiple domains.
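The multi-view rendering stage mentioned above typically places cameras at evenly spaced azimuths around each normalized object. The paper does not specify its camera layout; the sketch below is a minimal, hypothetical illustration of one common orbit scheme (fixed radius and elevation, uniform azimuth steps):

```python
import math

def orbit_cameras(n_views, radius=2.0, elevation_deg=20.0):
    """Return camera centers evenly spaced in azimuth on a sphere
    around the object origin - one common multi-view render layout.
    (Illustrative only; not the paper's actual configuration.)"""
    elev = math.radians(elevation_deg)
    poses = []
    for i in range(n_views):
        azim = 2 * math.pi * i / n_views
        x = radius * math.cos(elev) * math.cos(azim)
        y = radius * math.cos(elev) * math.sin(azim)
        z = radius * math.sin(elev)
        poses.append((x, y, z))
    return poses

views = orbit_cameras(8)
print(len(views))  # 8 viewpoints
# Every camera sits at the same distance from the origin.
print(all(abs(math.dist(p, (0.0, 0.0, 0.0)) - 2.0) < 1e-9 for p in views))
```

Each pose would then be paired with a look-at rotation toward the origin before rendering.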
Abstract
While recent advances in neural representations and generative models have revolutionized 3D content creation, the field remains constrained by significant data processing bottlenecks. To address this, we introduce HY3D-Bench, an open-source ecosystem designed to establish a unified, high-quality foundation for 3D generation. Our contributions are threefold: (1) We curate a library of 250k high-fidelity 3D objects distilled from large-scale repositories, employing a rigorous pipeline to deliver training-ready artifacts, including watertight meshes and multi-view renderings; (2) We introduce structured part-level decomposition, providing the granularity essential for fine-grained perception and controllable editing; and (3) We bridge real-world distribution gaps via a scalable AIGC synthesis pipeline, contributing 125k synthetic assets to enhance diversity in long-tail categories. Validated empirically through the training of Hunyuan3D-2.1-Small, HY3D-Bench democratizes access to robust data resources, aiming to catalyze innovation across 3D perception, robotics, and digital content creation.
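The "training-ready artifacts, including watertight meshes" in contribution (1) rest on a standard geometric criterion: a triangle mesh is watertight when every undirected edge is shared by exactly two faces. The paper does not describe its checker; the function below is a minimal, self-contained sketch of that criterion (production pipelines typically use a library such as trimesh for this):

```python
from collections import Counter

def is_watertight(faces):
    """Check the closed-surface condition: every undirected edge
    must appear in exactly two triangles. `faces` is a list of
    (i, j, k) vertex-index triples."""
    edge_counts = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edge_counts[(min(u, v), max(u, v))] += 1
    return all(count == 2 for count in edge_counts.values())

# A tetrahedron: four triangles over vertices 0..3 form a closed surface.
tetra = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
print(is_watertight(tetra))       # True
# Dropping one face leaves a boundary hole, so the check fails.
print(is_watertight(tetra[:-1]))  # False
```

Meshes failing this check would need hole-filling or remeshing before they can serve as supervision for occupancy- or SDF-based 3D generators.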