🤖 AI Summary
Problem: CAD modeling remains highly manual, with little support for multimodal interaction or automation. Method: This paper introduces GenCAD, an end-to-end framework that converts an image into a parametric CAD command sequence, producing editable and manufacturable 3D shapes. It combines CLIP-style contrastive representation learning, latent diffusion priors, and an autoregressive Transformer architecture to enable image-driven CAD command sequence generation with geometric constraint-aware decoding. Contributions/Results: (1) Generates topologically valid, parameter-tunable, and manufacturing-ready CAD models from a single input image; (2) Enables cross-modal CAD retrieval, improving image-to-model accuracy by 32.7% on large-scale CAD databases; (3) Outperforms existing state-of-the-art methods on both unconditional and image-conditioned CAD generation benchmarks. This work advances AI-driven automation of the design-to-manufacturing loop.
📝 Abstract
The creation of manufacturable and editable 3D shapes through Computer-Aided Design (CAD) remains a highly manual and time-consuming task, hampered by the complex topology of boundary representations of 3D solids and unintuitive design tools. While most work in the 3D shape generation literature focuses on representations like meshes, voxels, or point clouds, practical engineering applications demand the modifiability and manufacturability of CAD models, as well as support for multimodal conditional generation of CAD models. This paper introduces GenCAD, a generative model that employs autoregressive transformers with a contrastive learning framework and latent diffusion models to transform image inputs into parametric CAD command sequences, resulting in editable 3D shape representations. Extensive evaluations demonstrate that GenCAD significantly outperforms existing state-of-the-art methods in both unconditional and conditional generation of CAD models. Additionally, the contrastive learning framework of GenCAD facilitates the retrieval of CAD models from large CAD databases using image queries, a critical challenge within the CAD community. Our results provide a significant step forward in highlighting the potential of generative models to expedite the entire design-to-production pipeline and seamlessly integrate different design modalities.
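
The image-based retrieval described in the abstract can be illustrated with a minimal sketch. It assumes that a contrastively trained model has already mapped the query image and every CAD model into one shared embedding space, so that retrieval reduces to ranking by cosine similarity. The function names, the three-dimensional toy vectors, and the part names below are hypothetical and are not taken from GenCAD.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def retrieve(image_embedding, cad_embeddings, top_k=1):
    """Return the names of the top_k CAD entries most similar to the image.

    cad_embeddings maps a (hypothetical) CAD model name to its embedding.
    """
    scored = [
        (cosine_similarity(image_embedding, emb), name)
        for name, emb in cad_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Toy embeddings standing in for the outputs of trained image and CAD encoders.
cad_db = {
    "bracket": [0.9, 0.1, 0.0],
    "flange": [0.1, 0.9, 0.2],
    "gear": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # embedding of the query image (assumed)

print(retrieve(query, cad_db, top_k=2))  # ['bracket', 'flange']
```

In a real system the embeddings would come from the learned encoders and the database would be searched with a vectorized or approximate nearest-neighbor index rather than a Python loop; the ranking principle stays the same.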