AI Summary
This work proposes ProCAD, a framework that adds proactive clarification to text-to-CAD generation to address invalid outputs caused by under-specified descriptions or conflicting constraints. ProCAD pairs a clarifying agent with a CAD coding agent: the clarifying agent audits the prompt and asks targeted questions only when needed to reach a self-consistent specification, which the coding agent then translates into an executable CadQuery program. The clarifying agent is trained via agentic supervised fine-tuning (SFT) on clarification trajectories, while the coding agent is fine-tuned on a curated, high-quality text-to-CadQuery dataset. In the reported experiments, ProCAD outperforms frontier closed-source models, including Claude Sonnet 4.5, reducing the mean Chamfer distance by 79.9% and lowering the invalidity ratio from 4.8% to 0.9%, which improves both the accuracy and the robustness of natural-language-driven CAD generation.
Abstract
Large language models have recently enabled text-to-CAD systems that synthesize parametric CAD programs (e.g., CadQuery) from natural language prompts. In practice, however, geometric descriptions can be under-specified or internally inconsistent: critical dimensions may be missing and constraints may conflict. Existing fine-tuned models tend to reactively follow user instructions and hallucinate dimensions when the text is ambiguous. To address this, we propose a proactive agentic framework for text-to-CadQuery generation, named ProCAD, that resolves specification issues before code synthesis. Our framework pairs a proactive clarifying agent, which audits the prompt and asks targeted clarification questions only when necessary to produce a self-consistent specification, with a CAD coding agent that translates the specification into an executable CadQuery program. We fine-tune the coding agent on a curated high-quality text-to-CadQuery dataset and train the clarifying agent via agentic SFT on clarification trajectories. Experiments show that proactive clarification significantly improves robustness to ambiguous prompts while keeping interaction overhead low. ProCAD outperforms frontier closed-source models, including Claude Sonnet 4.5, reducing the mean Chamfer distance by 79.9 percent and lowering the invalidity ratio from 4.8 percent to 0.9 percent. Our code and datasets will be made publicly available.
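The clarify-then-code flow described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: in ProCAD both agents are fine-tuned language models, whereas this sketch uses a hand-written rule (a box needs `length`, `width`, and `height`) and hypothetical helper names `clarify` and `generate_cadquery`. The generated program is returned as text and is not executed, so CadQuery does not need to be installed.

```python
from typing import Callable, Dict, List

# Hypothetical rule: a box specification is complete once these are known.
REQUIRED_DIMS = ("length", "width", "height")


def clarify(spec: Dict[str, float], ask: Callable[[str], str]) -> List[str]:
    """Toy stand-in for the clarifying agent.

    Asks one question per missing dimension (and none if the spec is
    already complete), stores the answers in `spec`, and returns the
    questions that were asked.
    """
    asked = []
    for name in REQUIRED_DIMS:
        if name not in spec:
            question = f"What is the {name} of the box in mm?"
            spec[name] = float(ask(question))
            asked.append(question)
    return asked


def generate_cadquery(spec: Dict[str, float]) -> str:
    """Toy stand-in for the coding agent: emit a CadQuery program as text."""
    missing = [name for name in REQUIRED_DIMS if name not in spec]
    if missing:
        # Refuse to guess dimensions instead of hallucinating them.
        raise ValueError(f"specification is missing: {missing}")
    return (
        "import cadquery as cq\n"
        f"result = cq.Workplane('XY').box({spec['length']}, {spec['width']}, {spec['height']})\n"
    )


# Under-specified prompt: the height is missing, so exactly one question is asked.
spec = {"length": 40.0, "width": 20.0}
asked = clarify(spec, ask=lambda question: "10")
program = generate_cadquery(spec)
print(asked)
print(program)
```

The point of the split is that the coding step never sees an incomplete specification: missing values are either supplied by the user or cause an explicit error, and a complete prompt triggers no questions at all.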