🤖 AI Summary
This work addresses the challenging problem of generating parametric CAD models directly from unconstrained real-world images, aiming to lower the barrier to digital twin construction and to mitigate the scarcity of real CAD data. We propose a synthetic-data-driven paradigm: the model is trained exclusively on texture-free synthetic CAD data and relies on a geometric feature encoder to generalize across domains. We also introduce Direct Preference Optimization (DPO) into CAD sequence generation, using feedback from an automated code checker to learn geometric validity constraints without explicit geometric supervision. We introduce the first dataset of multi-view real CAD images paired with command sequences. On this benchmark, our method significantly outperforms existing approaches, is robust to variations in illumination, viewpoint, and occlusion, and generalizes to unseen object categories.
📝 Abstract
Creating CAD digital twins from the physical world is crucial for manufacturing, design, and simulation. However, current methods typically rely on costly 3D scanning with labor-intensive post-processing. To provide a user-friendly design process, we explore the problem of reverse engineering from unconstrained real-world CAD images that can be easily captured by users of any experience level. The scarcity of real-world CAD data makes it difficult to train such models directly. To tackle this challenge, we propose CADCrafter, an image-to-parametric CAD model generation framework that is trained solely on synthetic textureless CAD data and tested on real-world images. To bridge the significant representation disparity between images and parametric CAD models, we introduce a geometry encoder to accurately capture diverse geometric features. Moreover, the texture-invariant properties of these geometric features facilitate generalization to real-world scenarios. Since compiling CAD parameter sequences into explicit CAD models is a non-differentiable process, the network training inherently lacks explicit geometric supervision. To impose geometric validity constraints, we employ direct preference optimization (DPO) to fine-tune our model using feedback from an automatic code checker on CAD sequence quality. Furthermore, we collected a real-world dataset, comprising multi-view images paired with their corresponding CAD command sequences, to evaluate our method. Experimental results demonstrate that our approach can robustly handle real unconstrained CAD images and can even generalize to unseen general objects.
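The DPO fine-tuning step described above can be illustrated with a minimal sketch. It assumes the standard DPO objective, in which a policy is pushed to prefer a "chosen" sequence over a "rejected" one relative to a frozen reference model. The abstract does not specify how preference pairs are built or which checker is used, so `build_preference_pairs`, the `checker` callable, and the value of `beta` are hypothetical stand-ins for illustration only, not CADCrafter's actual implementation.

```python
import math
from typing import Callable, List, Tuple


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,  # assumed value; not given in the abstract
) -> float:
    """Standard DPO loss for a single preference pair.

    Each argument is the summed log-probability of a full sequence
    under the trainable policy or the frozen reference model.
    """
    margin = beta * (
        (policy_logp_chosen - ref_logp_chosen)
        - (policy_logp_rejected - ref_logp_rejected)
    )
    # -log(sigmoid(margin)) == softplus(-margin)
    return softplus(-margin)


def build_preference_pairs(
    candidates: List[str],
    checker: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Pair every sequence the checker accepts with every one it rejects.

    `checker` is a hypothetical stand-in for an automatic code checker
    that reports whether a CAD sequence compiles to a valid model.
    """
    verdicts = [(c, checker(c)) for c in candidates]
    valid = [c for c, ok in verdicts if ok]
    invalid = [c for c, ok in verdicts if not ok]
    return [(chosen, rejected) for chosen in valid for rejected in invalid]
```

Because the loss needs only sequence log-probabilities and a binary preference, the checker itself never has to be differentiable, which is what makes this kind of objective a fit for feedback from a non-differentiable CAD compilation step.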