🤖 AI Summary
This study systematically evaluates the capabilities of generative AI in professional architectural design tasks that require deep domain knowledge: technical drawing interpretation, 3D CAD model synthesis, and precise spatial assembly. Method: We employ multimodal prompting, 2D/3D architectural representation parsing, and automated CAD script generation, conducting the first comparative assessment of GPT-4o and Claude 3.5 on 3D reconstruction of Palladian architectural case studies, augmented with an iterative self-correction mechanism. Contribution/Results: Claude 3.5 significantly outperforms GPT-4o in spatial component assembly accuracy and in self-revision capability; both models generate semantically plausible architectural elements but consistently fail to recover complex geometric constraints and topological relationships. The work identifies key potentials and structural limitations of multimodal large language models as architectural co-designers and establishes a novel evaluation paradigm for domain-specific AI in architecture.
📝 Abstract
Recent advancements in multimodal Generative AI have the potential to democratize specialized architectural tasks, such as interpreting technical drawings and creating 3D CAD models, which traditionally require expert knowledge. This paper presents a comparative evaluation of two systems, GPT-4o and Claude 3.5, on the task of architectural 3D synthesis. We conduct a case study on two buildings from Palladio's Four Books of Architecture (1965): Villa Rotonda and Palazzo Porto. High-level architectural models and drawings of these buildings were prepared, inspired by Palladio's original texts and drawings. Through sequential text and image prompting, we assess the systems' abilities to (1) interpret 2D and 3D representations of buildings from drawings, (2) encode the buildings into a CAD software script, and (3) self-improve based on their outputs. While both systems successfully generate individual parts, they struggle to accurately assemble these parts into the desired spatial relationships, with Claude 3.5 demonstrating better performance, particularly in self-correcting its output. This study contributes to ongoing research on benchmarking the strengths and weaknesses of off-the-shelf AI systems in performing intelligent human tasks that require discipline-specific knowledge. The findings highlight the potential of language-enabled AI systems to act as collaborative technical assistants in the architectural design process.