🤖 AI Summary
To address the prohibitively high computational cost of large language model (LLM)-dependent knowledge graph construction in Graph-Augmented Retrieval-Augmented Generation (Graph-RAG), this paper proposes an LLM-free conceptual graph construction framework. First, it identifies discriminative key concepts from documents—those most salient for retrieval—then builds a lightweight conceptual graph leveraging concept co-occurrence and semantic association, enabling zero-cost knowledge completion. Furthermore, it introduces a graph-guided chunk filtering mechanism to support efficient multi-hop reasoning. Crucially, the entire graph construction and completion process requires no LLM invocation. Evaluated on multiple real-world datasets, the method reduces LLM call overhead by 92% on average compared to LLM-based baselines, while consistently outperforming state-of-the-art approaches in both retrieval accuracy and answer quality.
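The co-occurrence step described above can be sketched without any LLM calls: link two concepts whenever they appear in the same chunk, and keep a concept-to-chunk index for graph-guided chunk filtering at retrieval time. This is a minimal illustration, not the paper's implementation; the concept list, matching by plain substring lookup, and all function names here are assumptions for the example.

```python
# Hypothetical sketch of LLM-free concept graph construction via co-occurrence.
# Concept selection is simplified to a given keyword list; the paper's actual
# saliency criterion for picking concepts is not reproduced here.
from collections import defaultdict
from itertools import combinations

def build_concept_graph(chunks, concepts):
    """Edge weight = number of chunks where both concepts co-occur.
    Also returns a concept -> chunk-id index for chunk filtering."""
    graph = defaultdict(int)          # (concept_a, concept_b) -> co-occurrence count
    chunk_index = defaultdict(list)   # concept -> chunk ids containing it
    for cid, text in enumerate(chunks):
        present = sorted({c for c in concepts if c in text.lower()})
        for c in present:
            chunk_index[c].append(cid)
        for a, b in combinations(present, 2):
            graph[(a, b)] += 1        # no LLM invocation anywhere
    return dict(graph), dict(chunk_index)

chunks = [
    "Aspirin inhibits COX enzymes and reduces inflammation.",
    "COX inhibition by aspirin lowers prostaglandin levels.",
    "Prostaglandin signaling drives inflammation.",
]
concepts = ["aspirin", "cox", "prostaglandin", "inflammation"]
graph, index = build_concept_graph(chunks, concepts)
```

A multi-hop query about aspirin and inflammation can then walk the graph (aspirin → cox → prostaglandin → inflammation) and retrieve only the chunks listed in `index` for the traversed concepts, rather than scanning every chunk.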
📝 Abstract
Graph-based RAG constructs a knowledge graph (KG) from text chunks to enhance retrieval in Large Language Model (LLM)-based question answering. It is especially beneficial in domains such as biomedicine, law, and political science, where effective retrieval often involves multi-hop reasoning over proprietary documents. However, these methods demand numerous LLM calls to extract entities and relations from text chunks, incurring prohibitive costs at scale. Through a carefully designed ablation study, we observe that certain words (termed concepts) and their associated documents matter more for retrieval than others. Based on this insight, we propose Graph-Guided Concept Selection (G2ConS). Its core comprises a chunk selection method and an LLM-independent concept graph. The former selects salient document chunks to reduce KG construction costs; the latter closes knowledge gaps introduced by chunk selection at zero cost. Evaluations on multiple real-world datasets show that G2ConS outperforms all baselines in construction cost, retrieval effectiveness, and answering quality.