🤖 AI Summary
Traditional 3D generation methods produce non-editable meshes or point clouds, hindering iterative design and real-time modification. This work proposes Proc3D, a system that integrates procedural compact graph (PCG) representations with large language models to enable text-driven, editable 3D content generation. Using either GPT-4o with in-context learning or a fine-tuned LLaMA-3 model, Proc3D lets users perform real-time parametric edits through natural language instructions, sliders, and checkboxes, without requiring full regeneration. The approach achieves a more than 400-fold speedup in editing and a 28% gain in ULIP text-to-3D alignment scores over existing methods.
📝 Abstract
Generating 3D models has traditionally been a complex task requiring specialized expertise. While recent advances in generative AI have sought to automate this process, existing methods produce non-editable representations, such as meshes or point clouds, limiting their adaptability for iterative design. In this paper, we introduce Proc3D, a system designed to generate editable 3D models while enabling real-time modifications. At its core, Proc3D introduces the procedural compact graph (PCG), a graph representation of 3D models that encodes the algorithmic rules and structures necessary for generating the model. This representation exposes key parameters, allowing intuitive manual adjustments via sliders and checkboxes, as well as real-time, automated modifications through natural language prompts using Large Language Models (LLMs). We demonstrate Proc3D's capabilities using two generative approaches: GPT-4o with in-context learning (ICL) and a fine-tuned LLaMA-3 model. Experimental results show that Proc3D outperforms existing methods in editing efficiency, achieving more than a 400x speedup over conventional approaches that require full regeneration for each modification. Additionally, Proc3D improves the ULIP score, a metric that evaluates the alignment between generated 3D models and text prompts, by 28%. By enabling text-aligned 3D model generation along with precise, real-time parametric edits, Proc3D facilitates highly accurate text-based image editing applications.
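
To make the idea of an editable procedural graph concrete, here is a minimal Python sketch. It is not the PCG format or any Proc3D API: the `Node` and `ProceduralGraph` classes, the operation names (`box`, `cylinder`, `array`, `union`), and the `build_table` helper are all hypothetical, chosen only to illustrate how exposing parameters on graph nodes lets an edit take effect by re-evaluating the graph rather than regenerating the model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One operation in a hypothetical procedural graph (not the PCG spec)."""
    op: str                                      # "box", "cylinder", "array", "union"
    params: dict                                 # exposed, editable parameters
    inputs: list = field(default_factory=list)   # names of upstream nodes


@dataclass
class ProceduralGraph:
    nodes: dict = field(default_factory=dict)

    def add(self, name, op, params, inputs=()):
        self.nodes[name] = Node(op, dict(params), list(inputs))

    def set_param(self, node, key, value):
        """Edit one exposed parameter, as a slider or a parsed text command might."""
        if key not in self.nodes[node].params:
            raise KeyError(f"{node} has no parameter {key!r}")
        self.nodes[node].params[key] = value

    def evaluate(self, name):
        """Recursively expand a node into a flat list of primitive descriptions."""
        node = self.nodes[name]
        if node.op in ("box", "cylinder"):
            return [{"shape": node.op, **node.params}]
        upstream = [p for i in node.inputs for p in self.evaluate(i)]
        if node.op == "array":
            # Repeat every upstream primitive `count` times.
            return [dict(p, index=k)
                    for k in range(node.params["count"]) for p in upstream]
        if node.op == "union":
            return upstream
        raise ValueError(f"unknown op {node.op!r}")


def build_table():
    """Assumed example model: a table top plus an array of legs."""
    g = ProceduralGraph()
    g.add("top", "box", {"width": 1.2, "depth": 0.8, "height": 0.05})
    g.add("leg", "cylinder", {"radius": 0.03, "height": 0.7})
    g.add("legs", "array", {"count": 4}, inputs=["leg"])
    g.add("table", "union", {}, inputs=["top", "legs"])
    return g


graph = build_table()
before = graph.evaluate("table")      # 5 primitives: 1 top + 4 legs
graph.set_param("legs", "count", 6)   # the edit touches a single parameter
after = graph.evaluate("table")       # 7 primitives: 1 top + 6 legs
```

In this sketch the edit costs one dictionary update plus a re-evaluation of the graph, which is the general reason a parametric representation can be modified far more cheaply than a mesh that must be regenerated from scratch. How Proc3D actually maps a natural language prompt to a parameter change is not specified in the abstract and is not modeled here.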