GenCAD: Image-Conditioned Computer-Aided Design Generation with Transformer-Based Contrastive Representation and Diffusion Priors

📅 2024-09-08

🏛️ arXiv.org

📈 Citations: 4

✨ Influential: 0

🤖 AI Summary

Problem: CAD modeling remains highly manual and offers little support for multimodal interaction or automation. Method: This paper introduces the first end-to-end framework that maps an input image to a parametric CAD command sequence, yielding editable and manufacturable 3D shapes. It combines CLIP-style contrastive representation learning, latent diffusion priors, and an autoregressive Transformer to generate CAD command sequences from images with geometric constraint-aware decoding. Contributions/Results: (1) Generates topologically valid, parameter-tunable, and manufacturing-ready CAD models from a single input image; (2) Enables cross-modal CAD retrieval, improving image-to-model accuracy by 32.7% on large-scale CAD databases; (3) Outperforms existing state-of-the-art methods on both unconditional and image-conditioned CAD generation benchmarks. The work is a step toward AI-driven, closed-loop automation of the design-to-manufacturing pipeline.

📝 Abstract

The creation of manufacturable and editable 3D shapes through Computer-Aided Design (CAD) remains a highly manual and time-consuming task, hampered by the complex topology of boundary representations of 3D solids and unintuitive design tools. While most work in the 3D shape generation literature focuses on representations like meshes, voxels, or point clouds, practical engineering applications demand the modifiability and manufacturability of CAD models and the ability for multi-modal conditional CAD model generation. This paper introduces GenCAD, a generative model that employs autoregressive transformers with a contrastive learning framework and latent diffusion models to transform image inputs into parametric CAD command sequences, resulting in editable 3D shape representations. Extensive evaluations demonstrate that GenCAD significantly outperforms existing state-of-the-art methods in terms of the unconditional and conditional generations of CAD models. Additionally, the contrastive learning framework of GenCAD facilitates the retrieval of CAD models using image queries from large CAD databases, which is a critical challenge within the CAD community. Our results provide a significant step forward in highlighting the potential of generative models to expedite the entire design-to-production pipeline and seamlessly integrate different design modalities.

Problem

Research questions and friction points this paper is trying to address.

Generating editable 3D CAD models from images

Enhancing modifiability and manufacturability of CAD designs

Enabling image-based retrieval of CAD models
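
The retrieval task listed above can be illustrated with a minimal sketch, assuming that images and CAD models have already been embedded into a shared latent space by some pair of encoders. The function names, the model identifiers, and the three-dimensional toy vectors are hypothetical and are not taken from the paper; the ranking rule shown (cosine similarity) is a common choice for contrastively trained embeddings, not a confirmed detail of GenCAD.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_cad(image_embedding, cad_embeddings, top_k=1):
    """Return the ids of the top_k CAD models closest to the image query.

    cad_embeddings maps a (hypothetical) model id to its latent vector.
    """
    scored = [
        (cosine_similarity(image_embedding, vec), model_id)
        for model_id, vec in cad_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [model_id for _, model_id in scored[:top_k]]

# Toy database: the vectors stand in for CAD latents (assumed values).
cad_db = {
    "bracket": [0.9, 0.1, 0.0],
    "flange": [0.0, 1.0, 0.2],
    "gear": [0.1, 0.2, 0.95],
}
# Toy image embedding (assumed) that lies closest to "flange".
query = [0.05, 0.9, 0.3]

print(retrieve_cad(query, cad_db, top_k=2))  # ['flange', 'gear']
```

A real database of many CAD models would likely replace the linear scan with an approximate nearest-neighbor index, but the ranking principle stays the same.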

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based contrastive learning for CAD generation

Diffusion priors for multi-modal conditional generation

Image-to-CAD command sequence transformation
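
As a rough illustration of the CLIP-style contrastive objective mentioned in the summary, the sketch below computes a symmetric InfoNCE loss over a small batch of paired image and CAD embeddings. It is a generic formulation under stated assumptions, not the paper's implementation: the function names, the temperature of 0.07, and the toy vectors are all hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def log_softmax_at(logits, idx):
    """Log-softmax of logits evaluated at position idx (numerically stable)."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[idx] - lse

def contrastive_loss(image_embs, cad_embs, temperature=0.07):
    """Symmetric InfoNCE loss; row k of each input is a matching pair."""
    imgs = [normalize(v) for v in image_embs]
    cads = [normalize(v) for v in cad_embs]
    n = len(imgs)
    # Temperature-scaled cosine similarities: sim[i][j] = <img_i, cad_j> / T.
    sim = [
        [sum(a * b for a, b in zip(i, c)) / temperature for c in cads]
        for i in imgs
    ]
    # Image-to-CAD direction: each row should peak on its diagonal entry.
    loss_i2c = -sum(log_softmax_at(sim[k], k) for k in range(n)) / n
    # CAD-to-image direction: each column should peak on its diagonal entry.
    loss_c2i = -sum(
        log_softmax_at([sim[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return 0.5 * (loss_i2c + loss_c2i)

# Toy batch (assumed values): pair k of images matches pair k of CAD latents.
images = [[1.0, 0.0], [0.0, 1.0]]
cads = [[0.9, 0.1], [0.1, 0.9]]

aligned = contrastive_loss(images, cads)
shuffled = contrastive_loss(images, cads[::-1])
print(aligned < shuffled)  # True: correct pairing gives the lower loss
```

Minimizing a loss of this kind pulls matching image and CAD embeddings together and pushes mismatched ones apart, which is also what makes image-query retrieval over the same latent space possible.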

🔎 Similar Papers

Exploring the Potentials and Challenges of Deep Generative Models in Product Design Conception