🤖 AI Summary
Real-world CAD design requires assembling multiple components under both semantic and geometric constraints, a capability that existing single-component generation methods lack. This paper introduces CADKnitter, the first end-to-end framework for generative, assembly-aware multi-part CAD modeling. The approach makes three key contributions: (1) KnitCAD, a large-scale dataset of over 310K samples that aligns CAD models with text prompts and assembly metadata; (2) a geometry-guided diffusion sampling strategy that jointly incorporates cross-modal text-shape alignment, CAD topology-aware representation learning, and explicit geometric constraint encoding; and (3) controllable, high-fidelity assembly generation that synthesizes semantically coherent, geometrically compatible, and editable multi-part CAD models from text prompts and given components. Experiments show significant improvements over state-of-the-art methods on multi-part assembly generation, achieving, for the first time, engineering-constrained, controllable, and editable CAD synthesis.
📝 Abstract
Crafting computer-aided design (CAD) models has long been a painstaking and time-intensive task, demanding both precision and expertise from designers. The emergence of 3D generation has transformed this task, shifting the focus from visual fidelity to functional utility and enabling editable CAD designs. Prior works have achieved early success in single-part CAD generation, but this setting is not well suited to real-world applications, where multiple parts must be assembled under semantic and geometric constraints. In this paper, we propose CADKnitter, a compositional CAD generation framework with a geometry-guided diffusion sampling strategy. CADKnitter generates a complementary CAD part that satisfies both the geometric constraints of the given CAD model and the semantic constraints of the design text prompt. We also curate a dataset, KnitCAD, containing over 310,000 samples of CAD models, along with textual prompts and assembly metadata that provide semantic and geometric constraints. Extensive experiments demonstrate that our proposed method outperforms state-of-the-art baselines by a clear margin.