🤖 AI Summary
To address the high cost and limited local deployability of closed-source large language models (LLMs) in 3D CAD generation, this paper proposes Seek-CAD, a training-free, locally deployable framework for parametric CAD generation. It builds on the open-source LLM DeepSeek-R1 and is the first to combine vision-language model (VLM) evaluation of rendered models with chain-of-thought (CoT) feedback in an iterative self-correction loop, forming a closed-loop refinement pipeline. The paper also introduces SSR-triple, the first structured CAD dataset designed explicitly for semantic alignment with industrial design instructions. Experimental results show that Seek-CAD generates high-fidelity, editable parametric CAD models without fine-tuning or reliance on closed-source models, achieving superior performance in geometric accuracy, topological consistency, and instruction adherence.
📝 Abstract
The advent of Computer-Aided Design (CAD) generative modeling will significantly transform the design of industrial products. Recent research has extended this work to Large Language Models (LLMs). In contrast to fine-tuning methods, training-free approaches typically rely on advanced closed-source LLMs, which offers greater flexibility and efficiency when developing AI agents that generate parametric CAD models. However, the substantial cost of top-tier closed-source LLMs and the limits on deploying them locally pose challenges in practical applications. Seek-CAD is the first exploration of a locally deployed open-source reasoning LLM, DeepSeek-R1, for training-free generation of parametric CAD models. It is also the first study to incorporate both visual and Chain-of-Thought (CoT) feedback into a self-refinement mechanism for CAD model generation. Specifically, the initially generated parametric CAD model is rendered into a sequence of step-wise perspective images, which a Vision Language Model (VLM) processes together with the corresponding CoT from DeepSeek-R1 to assess the generated model. DeepSeek-R1 then uses this feedback to refine the model in the next round of generation. Moreover, we present a new 3D CAD model dataset structured around the SSR (Sketch, Sketch-based feature, and Refinements) triple design paradigm. The dataset covers a wide range of CAD commands, which aligns it with industrial application requirements and makes it well suited to LLM-based generation. Extensive experiments validate the effectiveness of Seek-CAD under various metrics.
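The self-refinement mechanism described above can be pictured as a short loop. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the names `Draft`, `Verdict`, `self_refine`, `generate`, `render_steps`, and `evaluate` are hypothetical, the three callables stand in for DeepSeek-R1, the step-wise renderer, and the VLM respectively, and the stopping rule (`verdict.ok` or a `max_rounds` cap) is an assumption, since the abstract does not state when refinement ends.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Draft:
    cad_code: str  # parametric CAD model in some textual form (format assumed)
    cot: str       # chain-of-thought produced by the LLM alongside the model


@dataclass
class Verdict:
    ok: bool       # True if the evaluator judges the renders to match the request
    feedback: str  # textual critique handed back to the LLM


def self_refine(
    prompt: str,
    generate: Callable[[str, Optional[Draft], Optional[str]], Draft],
    render_steps: Callable[[str], List[bytes]],
    evaluate: Callable[[str, List[bytes], str], Verdict],
    max_rounds: int = 3,  # assumed cap; not specified in the abstract
) -> Draft:
    """Generate, render step-wise, evaluate with images plus CoT, then refine."""
    draft = generate(prompt, None, None)  # initial generation, no feedback yet
    for _ in range(max_rounds):
        images = render_steps(draft.cad_code)          # step-wise perspective images
        verdict = evaluate(prompt, images, draft.cot)  # VLM sees images and CoT
        if verdict.ok:
            break
        draft = generate(prompt, draft, verdict.feedback)  # next round
    return draft


# Toy stand-ins so the loop can run without any model or CAD kernel.
calls = {"generate": 0}


def fake_generate(prompt, previous, feedback):
    calls["generate"] += 1
    version = 0 if previous is None else int(previous.cad_code[1:]) + 1
    return Draft(cad_code=f"v{version}", cot=f"reasoning for v{version}")


def fake_render(cad_code):
    return [cad_code.encode()]  # one fake "image" per model


def fake_evaluate(prompt, images, cot):
    good = images[0] == b"v2"  # pretend the third draft is acceptable
    return Verdict(ok=good, feedback="" if good else "hole is missing")


result = self_refine("a plate with a hole", fake_generate, fake_render, fake_evaluate)
```

Passing the three stages as callables keeps the loop independent of any particular model server or CAD kernel, so a real LLM client, renderer, and VLM client could be substituted for the toy functions without changing `self_refine`.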