🤖 AI Summary
Current AI systems struggle to comprehend the visual-structural knowledge embedded in engineering diagrams, which limits their usefulness in scientific discovery. To address this gap, this work proposes Enginuity, the first open, large-scale, multi-domain dataset of engineering diagrams, with structural annotations that explicitly capture hierarchical component relationships, connections, and semantic elements. The dataset is intended to support key tasks such as structured diagram parsing, cross-modal information retrieval, and AI-assisted engineering simulation, and to serve as a training resource for multimodal large language models. By providing this structured supervision, Enginuity aims to improve the ability of AI systems to understand and reason about engineering diagrams, and so to ease their integration into scientific discovery workflows.
📝 Abstract
We propose Enginuity, the first open, large-scale, multi-domain engineering diagram dataset with comprehensive structural annotations designed for automated diagram parsing. By capturing hierarchical component relationships, connections, and semantic elements across diverse engineering domains, the proposed dataset would enable multimodal large language models to address critical downstream tasks, including structured diagram parsing, cross-modal information retrieval, and AI-assisted engineering simulation. Enginuity would be transformative for AI for Scientific Discovery by enabling AI systems to comprehend and manipulate the visual-structural knowledge embedded in engineering diagrams. This would remove a fundamental barrier that currently prevents AI from fully participating in scientific workflows where diagram interpretation, technical drawing analysis, and visual reasoning are essential for hypothesis generation, experimental design, and discovery.