🤖 AI Summary
This work addresses the frequent topological errors that arise when generating symbolic representations of CAD objects. We propose EvoCAD, a framework that combines vision-language models (evaluated with GPT-4V and GPT-4o) for semantic understanding and generation with evolutionary algorithms for iterative optimization. A key contribution is a pair of new metrics based on the Euler characteristic, which capture a form of semantic similarity between 3D objects and complement existing spatial metrics. Evaluated on the CADPrompt benchmark, EvoCAD outperforms prior methods on multiple metrics, particularly in generating topologically correct objects. The proposed metrics can be evaluated efficiently, which makes them a practical addition to existing measures of CAD generation quality.
📝 Abstract
Combining large language models (LLMs) with evolutionary computation algorithms is a promising research direction that pairs the generative and in-context learning capabilities of LLMs with the strengths of evolutionary algorithms. In this work, we present EvoCAD, a method for generating computer-aided design (CAD) objects through their symbolic representations using vision-language models and evolutionary optimization. Our method samples multiple CAD objects, which are then optimized using an evolutionary approach with vision-language models and reasoning language models. We assess our method using GPT-4V and GPT-4o, evaluating it on the CADPrompt benchmark dataset and comparing it to prior methods. Additionally, we introduce two new metrics based on topological properties defined by the Euler characteristic, which capture a form of semantic similarity between 3D objects. Our results demonstrate that EvoCAD outperforms previous approaches on multiple metrics, particularly in generating topologically correct objects. Topological correctness can be evaluated efficiently with our two new metrics, which complement existing spatial metrics.
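To make the Euler-characteristic idea concrete, the following is a minimal sketch of how a topological comparison between a generated and a reference object might be computed on triangle meshes. It uses the standard formula χ = V − E + F; the function names (`euler_characteristic`, `topology_matches`, `topology_error`, `torus_faces`) and the two comparison rules are illustrative assumptions, since the abstract does not give the exact definitions of the two metrics.

```python
from typing import List, Sequence, Tuple

Triangle = Tuple[int, int, int]  # three vertex indices


def euler_characteristic(faces: Sequence[Triangle]) -> int:
    """Return chi = V - E + F for a triangle mesh given as index triples."""
    vertices = set()
    edges = set()
    for a, b, c in faces:
        vertices.update((a, b, c))
        for u, v in ((a, b), (b, c), (c, a)):
            # Store each undirected edge once, regardless of orientation.
            edges.add((min(u, v), max(u, v)))
    return len(vertices) - len(edges) + len(faces)


def topology_matches(generated: Sequence[Triangle],
                     reference: Sequence[Triangle]) -> bool:
    """Hypothetical binary check: do both meshes share the same chi?"""
    return euler_characteristic(generated) == euler_characteristic(reference)


def topology_error(generated: Sequence[Triangle],
                   reference: Sequence[Triangle]) -> int:
    """Hypothetical graded check: absolute difference in chi."""
    return abs(euler_characteristic(generated) - euler_characteristic(reference))


def torus_faces(n: int = 3, m: int = 3) -> List[Triangle]:
    """Triangulate an n x m periodic grid (a torus); requires n, m >= 3."""
    faces: List[Triangle] = []
    for i in range(n):
        for j in range(m):
            v00 = i * m + j
            v10 = ((i + 1) % n) * m + j
            v01 = i * m + (j + 1) % m
            v11 = ((i + 1) % n) * m + (j + 1) % m
            # Split each grid quad into two triangles along one diagonal.
            faces.append((v00, v10, v11))
            faces.append((v00, v11, v01))
    return faces


# A solid without holes (tetrahedron surface) versus one with a through-hole (torus).
tetrahedron = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
torus = torus_faces()

print(euler_characteristic(tetrahedron))  # 2
print(euler_characteristic(torus))        # 0
print(topology_matches(tetrahedron, torus))  # False
```

For a closed orientable surface, χ = 2 − 2g, where g is the number of handles, so a generated part that is missing a through-hole present in the reference yields a different χ even when the two shapes overlap closely in space. That is one way such a check can complement spatial metrics.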