CADEvolve: Creating Realistic CAD via Program Evolution

📅 2026-02-18
🤖 AI Summary
This work addresses the limitations of existing public CAD datasets, which often lack complex operations, multi-step combinations, and explicit design intent, and thereby hinder the generalization of AI models in real-world industrial settings. To overcome this, the authors propose a generation framework that integrates program evolution with vision-language model (VLM) guidance. Starting from simple geometric primitives, the approach iteratively constructs executable CadQuery programs of industrial-level complexity through VLM-driven editing and validation. It is described as the first method to combine program evolution with VLM-based reasoning to produce a high-quality, parameterized CAD dataset covering the full CadQuery operation set, enhanced by multi-stage post-processing and data augmentation. The resulting dataset comprises 1.3 million scripts, and a VLM fine-tuned on it achieves state-of-the-art performance on established Image2CAD benchmarks, including DeepCAD, Fusion 360, and MCB.

📝 Abstract
Computer-Aided Design (CAD) delivers rapid, editable modeling for engineering and manufacturing. Recent AI progress now makes full automation feasible for various CAD tasks. However, progress is bottlenecked by data: public corpora mostly contain sketch-extrude sequences; lack complex operations, multi-operation composition, and design intent; and thus hinder effective fine-tuning. Attempts to bypass this with frozen VLMs often yield simple or invalid programs due to limited 3D grounding in current foundation models. We present CADEvolve, an evolution-based pipeline and dataset that starts from simple primitives and, via VLM-guided edits and validations, incrementally grows CAD programs toward industrial-grade complexity. The result is 8k complex parts expressed as executable CadQuery parametric generators. After multi-stage post-processing and augmentation, we obtain a unified dataset of 1.3M scripts paired with rendered geometry and exercising the full CadQuery operation set. A VLM fine-tuned on CADEvolve achieves state-of-the-art results on the Image2CAD task across the DeepCAD, Fusion 360, and MCB benchmarks.
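
The propose-validate-grow loop described in the abstract can be illustrated with a toy sketch. This is not the CADEvolve implementation: the abstract gives no algorithmic details, so every name here (`Program`, `EDIT_OPS`, `propose_edit`, `is_valid`, `evolve`) is hypothetical. A program is modeled as a plain list of operation names, a random choice stands in for the VLM-guided edit, and a simple rule stands in for execution and geometry validation.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Program:
    """A candidate CAD program, modeled as an ordered list of operation names."""
    ops: list = field(default_factory=list)


# Hypothetical operation vocabulary, for illustration only.
EDIT_OPS = ["extrude", "fillet", "chamfer", "hole", "shell", "loft"]


def propose_edit(parent, rng):
    """Stand-in for a VLM-guided edit: append one operation to the parent."""
    return Program(ops=parent.ops + [rng.choice(EDIT_OPS)])


def is_valid(program):
    """Stand-in for validation (a real pipeline would execute the script and
    check the geometry). Toy rule: no operation may repeat back to back."""
    return all(a != b for a, b in zip(program.ops, program.ops[1:]))


def evolve(seeds, generations, children_per_parent, seed=0):
    """Grow programs from simple seeds, keeping only validated children."""
    rng = random.Random(seed)
    population = list(seeds)   # every accepted program so far
    frontier = list(seeds)     # programs eligible to be edited next
    for _ in range(generations):
        next_frontier = []
        for parent in frontier:
            for _ in range(children_per_parent):
                child = propose_edit(parent, rng)
                if is_valid(child):          # rejected edits are dropped
                    next_frontier.append(child)
        population.extend(next_frontier)
        frontier = next_frontier
    return population


if __name__ == "__main__":
    seeds = [Program(ops=["box"]), Program(ops=["cylinder"])]
    pool = evolve(seeds, generations=3, children_per_parent=2)
    print(len(pool), "programs; longest has",
          max(len(p.ops) for p in pool), "operations")
```

Each generation adds at most one operation per program, so complexity grows incrementally from the primitives, and only children that pass validation become parents in the next round.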

Problem

Research questions and friction points this paper is trying to address.

CAD dataset
complex operations
design intent
3D grounding
program generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

program evolution
CAD generation
visual language model
parametric modeling
dataset synthesis