🤖 AI Summary
This work addresses the high cost, low efficiency, and hallucination risk of applying large language models (LLMs) directly to engineering diagrams such as Piping and Instrumentation Diagrams (P&IDs). It brings the GraphRAG paradigm to this domain, using the DEXPI standard to convert smart P&IDs into structured knowledge graphs. The proposed ChatP&ID framework combines several retrieval strategies, including GraphRAG and ContextRAG, with LLM agents to enable grounded, low-cost natural-language interaction. Benchmarks across commercial LLM APIs show that graph-based representations improve accuracy by 18% over raw image inputs and reduce token costs by 85% compared with directly ingesting smart P&ID files. GPT-5-mini combined with ContextRAG reaches 91% accuracy at a cost of only $0.004 per task. The framework also lays the groundwork for engineering tasks such as HAZOP analysis.
📝 Abstract
Large Language Models (LLMs) combined with Retrieval-Augmented Generation (RAG) and knowledge graphs offer new opportunities for interacting with engineering diagrams such as Piping and Instrumentation Diagrams (P&IDs). However, directly processing raw images or smart P&ID files with LLMs is often costly, inefficient, and prone to hallucinations. This work introduces ChatP&ID, an agentic framework that enables grounded and cost-effective natural-language interaction with P&IDs using Graph Retrieval-Augmented Generation (GraphRAG), a paradigm we refer to as GraphRAG for engineering diagrams. Smart P&IDs encoded in the DEXPI standard are transformed into structured knowledge graphs, which serve as the basis for graph-based retrieval and reasoning by LLM agents. This approach enables reliable querying of engineering diagrams while significantly reducing computational cost. Benchmarking across commercial LLM APIs (OpenAI, Anthropic) demonstrates that graph-based representations improve accuracy by 18% over raw image inputs and reduce token costs by 85% compared to directly ingesting smart P&ID files. While small open-source models still struggle to interpret knowledge graph formats and structured engineering data, integrating them with VectorRAG and PathRAG improves response accuracy by up to 40%. Notably, GPT-5-mini combined with ContextRAG achieves 91% accuracy at a cost of only $0.004 per task. The resulting ChatP&ID interface enables intuitive natural-language interaction with complex engineering diagrams and lays the groundwork for AI-assisted process engineering tasks such as Hazard and Operability Studies (HAZOP) and multi-agent analysis.
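The graph-based retrieval idea described above can be illustrated with a minimal sketch. It is not the ChatP&ID implementation: the equipment tags, relation names, and the helper functions `build_adjacency`, `retrieve_subgraph`, and `to_context` are all hypothetical, and the triples are written by hand rather than parsed from a DEXPI file. The sketch only shows why a small retrieved subgraph can stand in for the full diagram as LLM context.

```python
from collections import deque

# Hypothetical P&ID fragment as (subject, relation, object) triples.
# Tags and relation names are illustrative; they are not taken from
# the DEXPI standard or from the paper.
TRIPLES = [
    ("T-101", "feeds", "P-101"),
    ("P-101", "feeds", "V-101"),
    ("V-101", "feeds", "E-101"),
    ("FIC-101", "controls", "V-101"),
    ("E-101", "feeds", "R-101"),
]


def build_adjacency(triples):
    """Index triples by node so neighbors can be found in either direction."""
    adjacency = {}
    for subject, relation, obj in triples:
        adjacency.setdefault(subject, []).append((subject, relation, obj))
        adjacency.setdefault(obj, []).append((subject, relation, obj))
    return adjacency


def retrieve_subgraph(triples, seed, max_hops=1):
    """Return the triples within max_hops of seed, in breadth-first order."""
    adjacency = build_adjacency(triples)
    seen_nodes = {seed}
    selected = []
    queue = deque([(seed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for triple in adjacency.get(node, []):
            if triple not in selected:
                selected.append(triple)
            for neighbor in (triple[0], triple[2]):
                if neighbor not in seen_nodes:
                    seen_nodes.add(neighbor)
                    queue.append((neighbor, depth + 1))
    return selected


def to_context(triples):
    """Serialize triples as compact text lines suitable for an LLM prompt."""
    return "\n".join(f"{s} --{r}--> {o}" for s, r, o in triples)


# Retrieve only the neighborhood of one valve instead of the whole diagram.
context = to_context(retrieve_subgraph(TRIPLES, "V-101", max_hops=1))
print(context)
```

For a question about valve `V-101`, only the three triples touching that valve are passed to the model. Raising `max_hops` trades a larger prompt for more surrounding context.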