Seek-CAD: A Self-refined Generative Modeling for 3D Parametric CAD Using Local Inference via DeepSeek

📅 2025-05-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high deployment cost and poor local deployability of closed-source large language models (LLMs) for 3D CAD generation, this paper proposes Seek-CAD, a training-free, locally deployable parametric CAD generation framework. Methodologically, it builds on the open-source reasoning LLM DeepSeek-R1 and integrates vision-language model (VLM)-guided evaluation of rendered models with chain-of-thought (CoT)-driven iterative self-correction, forming a closed-loop refinement pipeline. The paper also introduces a structured CAD dataset built on the SSR (Sketch, Sketch-based feature, Refinements) triple design paradigm, aligned with industrial instruction semantics. Experimental results show that Seek-CAD generates high-fidelity, editable parametric CAD models without fine-tuning or reliance on closed-source models, performing well on geometric accuracy, topological consistency, and instruction adherence.

📝 Abstract
Generative modeling for Computer-Aided Design (CAD) stands to significantly transform the design of industrial products, and recent research has extended into the realm of Large Language Models (LLMs). In contrast to fine-tuning methods, training-free approaches typically rely on advanced closed-source LLMs, offering greater flexibility and efficiency when building AI agents that generate parametric CAD models. However, the substantial cost of top-tier closed-source LLMs and the difficulty of deploying them locally pose challenges in practical applications. Seek-CAD is the first exploration of the locally deployed, open-source reasoning LLM DeepSeek-R1 for parametric CAD model generation with a training-free methodology, and the first study to incorporate both visual and Chain-of-Thought (CoT) feedback within a self-refinement mechanism for CAD generation. Specifically, the initially generated parametric CAD model is rendered into a sequence of step-wise perspective images, which a Vision Language Model (VLM) then assesses alongside the corresponding CoTs produced by DeepSeek-R1. This feedback is passed back to DeepSeek-R1 to refine the model in the next round of generation. Moreover, we present a 3D CAD model dataset structured around the SSR (Sketch, Sketch-based feature, and Refinements) triple design paradigm. The dataset covers a wide range of CAD commands, aligning with industrial application requirements and making it well suited to LLM-based generation. Extensive experiments validate the effectiveness of Seek-CAD under various metrics.
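
As a concrete illustration of this loop, below is a minimal Python sketch of the described visual-plus-CoT self-refinement cycle. It assumes the caller supplies wrappers around a local DeepSeek-R1 endpoint, a step-wise CAD renderer, and a VLM critic; the function names, the `acceptable` feedback flag, and the round limit are hypothetical choices for illustration, not the authors' implementation.

```python
from typing import Any, Callable, Dict, List, Tuple

# Illustrative sketch of a visual + CoT self-refinement loop (not the paper's code).
# The caller plugs in wrappers around a local DeepSeek-R1 endpoint, a CAD renderer,
# and a VLM critic; only the loop structure follows the abstract's description.

def self_refine_cad(
    instruction: str,
    generate: Callable[[str], Tuple[str, str]],                   # instruction -> (CAD program, CoT)
    render: Callable[[str], List[bytes]],                         # CAD program -> step-wise perspective images
    critique: Callable[[str, List[bytes], str], Dict[str, Any]],  # (instruction, images, CoT) -> VLM feedback
    refine: Callable[[str, str, str, Dict[str, Any]], Tuple[str, str]],  # revise program using the feedback
    max_rounds: int = 3,
) -> str:
    program, cot = generate(instruction)               # initial draft from DeepSeek-R1 plus its chain of thought
    for _ in range(max_rounds):
        images = render(program)                       # one rendered view per modeling step
        feedback = critique(instruction, images, cot)  # VLM checks the renders against the instruction and CoT
        if feedback.get("acceptable", False):          # hypothetical flag: the VLM accepts the current model
            break
        program, cot = refine(instruction, program, cot, feedback)  # next round of generation
    return program                                     # final editable parametric CAD program
```

Passing the generator, renderer, critic, and refiner in as callables keeps the sketch independent of any particular serving stack for DeepSeek-R1 or the VLM.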
Problem

Research questions and friction points this paper is trying to address.

Locally deploy open-source LLM for CAD model generation
Incorporate visual and CoT feedback in self-refinement
Create dataset aligned with industrial CAD requirements
Innovation

Methods, ideas, or system contributions that make the work stand out.

Locally deployed open-source LLM DeepSeek-R1
Self-refinement with visual and CoT feedback
Novel SSR triple design paradigm dataset (see the data-structure sketch below)
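
To give a rough sense of the SSR (Sketch, Sketch-based feature, Refinements) design paradigm, here is a minimal sketch of how one such triple might be represented; the field names and types are illustrative assumptions, not the dataset's published schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Illustrative representation of one SSR triple; all field names are assumptions.

@dataclass
class Sketch:
    plane: str             # e.g. "XY" sketch plane
    curves: List[str]      # 2D primitives such as lines, arcs, circles

@dataclass
class SketchFeature:
    kind: str              # sketch-based feature, e.g. "extrude" or "revolve"
    parameters: Dict[str, float]   # e.g. {"distance": 10.0}

@dataclass
class SSRTriple:
    sketch: Sketch
    feature: SketchFeature
    refinements: List[str] = field(default_factory=list)  # e.g. fillet, chamfer, shell commands

@dataclass
class CADModel:
    triples: List[SSRTriple]       # a model is an ordered sequence of SSR triples
```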
Xueyang Li
University of Notre Dame
Jiahao Li
School of Computer Science, Fudan University, Shanghai, China
Yu Song
School of Computer Science, Fudan University, Shanghai, China
Yunzhong Lou
School of Computer Science, Fudan University, Shanghai, China
Xiangdong Zhou
School of Computer Science, Fudan University, Shanghai, China