🤖 AI Summary
Autoregressive 3D mesh generation faces an inherent trade-off between quality and speed: higher fidelity typically requires longer sequences or larger models, which deepen sequential dependencies and thereby hinder inference efficiency and incremental editing. This paper introduces the first training-free, plug-and-play retrieval-augmented framework that decouples these sequence dependencies to enable part-level parallel generation and fusion. Our method integrates point cloud segmentation, spatial transformation, and registration to efficiently retrieve semantically and structurally similar components from a mesh repository and geometrically adapt them, augmenting the generator's geometric priors. Evaluated on multiple state-of-the-art models, it achieves significant improvements in generation quality (−12.6% Chamfer Distance), 2.3× faster inference, and zero-shot local editing. The core contribution is breaking the longstanding "sequentiality–efficiency–controllability" triadic constraint, establishing a new paradigm for efficient and controllable 3D content generation.
📝 Abstract
3D meshes are a critical building block for applications ranging from industrial design and gaming to simulation and robotics. Traditionally, meshes are crafted manually by artists, a process that is time-intensive and difficult to scale. To automate and accelerate asset creation, autoregressive models have emerged as a powerful paradigm for artistic mesh generation. However, current methods typically enhance quality through larger models or longer token sequences, both of which increase generation time, and their inherently sequential nature imposes a severe quality-speed trade-off. This sequential dependency also significantly complicates incremental editing. To overcome these limitations, we propose Mesh RAG, a novel, training-free, plug-and-play framework for autoregressive mesh generation models. Inspired by retrieval-augmented generation (RAG) for language models, our approach augments the generation process by leveraging point cloud segmentation, spatial transformation, and point cloud registration to retrieve, generate, and integrate mesh components. This retrieval-based approach decouples generation from its strict sequential dependency, facilitating efficient and parallelizable inference. We demonstrate the wide applicability of Mesh RAG across various foundational autoregressive mesh generation models, showing it significantly enhances mesh quality, accelerates generation speed compared to sequential part prediction, and enables incremental editing, all without model retraining.
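The retrieve-and-integrate idea in the abstract can be illustrated with a minimal sketch. The paper's actual pipeline is not specified here, so the following is an assumption-laden toy: parts are represented as point clouds, retrieval uses symmetric Chamfer distance against a repository, and geometric adaptation uses a closed-form rigid registration (the Kabsch algorithm, a standard stand-in for the paper's unspecified registration step). All function names are hypothetical.

```python
import numpy as np

def chamfer(a, b):
    """Symmetric Chamfer distance between point sets a (N,3) and b (M,3)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def retrieve(query, repository):
    """Return the index of the repository part geometrically closest to the query segment."""
    return int(np.argmin([chamfer(query, part) for part in repository]))

def kabsch_align(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst, given correspondences.

    Note: Kabsch assumes point-to-point correspondence; a full pipeline would
    use a correspondence-free method (e.g. ICP) on raw retrieved parts.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)               # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t
```

A usage sketch: segment the target shape, `retrieve` a candidate part for each segment, then `kabsch_align` the candidate into the segment's pose before fusing it with the generated mesh; because each segment is handled independently, the retrieval and alignment steps parallelize trivially.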