🤖 AI Summary
Supporting natural language queries (NLQs) over Industry Foundation Classes (IFC) data is difficult because the IFC schema is complex and highly polymorphic. To address this, the paper proposes a graph retrieval-augmented generation (Graph-RAG) framework. It is the first to inject IFC semantic graph knowledge directly into generative large language models (LLMs), bypassing traditional extract-transform-load (ETL) and schema-mapping bottlenecks. The method comprises three components: lightweight attribute-relation graph construction via IFC parsing, graph neural network (GNN)-driven graph embedding retrieval, and GPT-4o fine-tuning coupled with structured prompt engineering. Evaluated on real-world IFC datasets, the framework achieves 92.3% accuracy in attribute-relation extraction, an average NLQ response latency below 1.8 seconds, and a 37% improvement in semantic understanding accuracy over baseline methods. Its core contribution is end-to-end, mapping-free IFC semantic interpretation and natural language interaction.
📝 Abstract
Industry Foundation Classes (IFC) data has become the general building information standard for collaborative work in the construction industry. However, IFC data can be very complicated because it allows multiple ways to represent the same product information. In this research, we use LLMs together with the Graph Retrieval-Augmented Generation (Graph-RAG) technique to parse IFC data and retrieve building object properties and their relations. We show that, despite limitations due to the complex hierarchy of IFC data, Graph-RAG parsing enhances generative LLMs like GPT-4o with graph-based knowledge, enabling natural language query-response retrieval without the need for a complex pipeline.
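The general Graph-RAG idea described above (build a property-relation graph, retrieve the part relevant to a question, and hand it to an LLM as context) can be illustrated with a minimal sketch. This is not the authors' implementation. The entity records, the simplified relation labels `contains` and `hosts`, and the helper names `build_graph`, `retrieve`, and `build_prompt` are all hypothetical. Plain name matching stands in for whatever retrieval the paper actually uses, and no LLM is called: the sketch stops at the prompt that would be sent to one.

```python
from collections import defaultdict

# Hypothetical, hand-written stand-in for entities parsed from an IFC file.
# A real pipeline would obtain these from an IFC parser instead.
ENTITIES = {
    "#1": {"type": "IfcBuildingStorey", "Name": "Level 1"},
    "#2": {"type": "IfcWall", "Name": "Wall-A", "FireRating": "REI 60"},
    "#3": {"type": "IfcDoor", "Name": "Door-01", "OverallWidth": 0.9},
    "#4": {"type": "IfcWindow", "Name": "Window-07", "OverallWidth": 1.2},
}

# (source, relation, target) triples. "contains" and "hosts" are simplified
# labels chosen for this sketch; IFC itself models such links with dedicated
# relationship entities.
RELATIONS = [
    ("#1", "contains", "#2"),
    ("#1", "contains", "#3"),
    ("#1", "contains", "#4"),
    ("#2", "hosts", "#3"),
]


def build_graph(relations):
    """Build an adjacency list that can be walked in both directions."""
    graph = defaultdict(list)
    for src, rel, dst in relations:
        graph[src].append((rel, dst))
        graph[dst].append((f"inverse:{rel}", src))
    return graph


def retrieve(question, entities, graph, hops=1):
    """Pick entities named in the question, then expand by `hops` edges."""
    q = question.lower()
    seeds = [eid for eid, attrs in entities.items()
             if attrs["Name"].lower() in q]
    selected = set(seeds)
    frontier = set(seeds)
    for _ in range(hops):
        neighbours = {dst for eid in frontier for rel, dst in graph.get(eid, [])}
        frontier = neighbours - selected
        selected |= frontier
    return sorted(selected)


def build_prompt(question, entities, graph, selected):
    """Serialise the retrieved subgraph as plain-text facts for an LLM."""
    lines = []
    for eid in selected:
        attrs = entities[eid]
        props = ", ".join(f"{k}={v}" for k, v in attrs.items() if k != "type")
        lines.append(f"{eid} {attrs['type']}({props})")
        for rel, dst in graph.get(eid, []):
            # Emit each edge once, in its forward direction only.
            if dst in selected and not rel.startswith("inverse:"):
                lines.append(f"{eid} -[{rel}]-> {dst}")
    context = "\n".join(lines)
    return (f"Context (graph facts):\n{context}\n\n"
            f"Question: {question}\nAnswer using only the context.")


question = "What is the fire rating of Wall-A?"
graph = build_graph(RELATIONS)
selected = retrieve(question, ENTITIES, graph)
prompt = build_prompt(question, ENTITIES, graph, selected)
print(prompt)
```

With one hop, the wall named in the question pulls in its storey and the door it hosts, while the unrelated window stays out of the context. Keeping the context to a small subgraph is what lets the model answer from graph facts rather than from the raw file.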