🤖 AI Summary
Natural language question answering over knowledge graphs (KGs) suffers from unreliable reasoning due to linguistic ambiguity and the sheer scale of KGs. Method: This paper proposes *Thought Grounding*—a paradigm that explicitly anchors each step of large language model (LLM) reasoning—including chain-of-thought (CoT), tree-of-thought (ToT), and graph-of-thought (GoT)—to structured KG data, ensuring traceability and verifiability. It introduces the first systematic integration of multi-granularity KG retrieval (agent-assisted and automated search), KG embeddings, graph traversal, and a multi-strategy reasoning framework. Contribution/Results: Evaluated on the GRBench graph reasoning benchmark, our approach significantly outperforms non-grounded baselines, achieving substantial improvements in both reasoning accuracy and factual consistency. These results empirically validate KGs’ dual role in constraining and enhancing LLM reasoning processes.
📝 Abstract
Knowledge Graphs (KGs) are valuable tools for representing relationships between entities in a structured format. Traditionally, these knowledge bases are queried to extract specific information. However, question-answering (QA) over such KGs poses a challenge due to the intrinsic complexity of natural language relative to the structured format, as well as the size of these graphs. Despite these challenges, the structured nature of KGs can provide a solid foundation for grounding the outputs of Large Language Models (LLMs), offering organizations increased reliability and control. Recent advancements in LLMs have introduced reasoning methods at inference time to improve their performance and maximize their capabilities. In this work, we propose integrating these reasoning strategies with KGs to anchor every step or "thought" of the reasoning chains in KG data. Specifically, we evaluate both agentic and automated search methods across several reasoning strategies, including Chain-of-Thought (CoT), Tree-of-Thought (ToT), and Graph-of-Thought (GoT), using GRBench, a benchmark dataset for graph reasoning with domain-specific graphs. Our experiments demonstrate that this approach consistently outperforms baseline models, highlighting the benefits of grounding LLM reasoning processes in structured KG data.
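The core idea of anchoring every reasoning "thought" in KG data can be sketched in a few lines. The following is a minimal, hypothetical illustration, not the paper's actual system: the toy triples, the `ToyKG` class, and the deterministic traversal all stand in for the multi-granularity retrieval and LLM-driven step proposal described above. The key property it shows is that each step in the chain is verbalized from a retrieved KG fact, and the chain stops rather than continuing ungrounded.

```python
# Illustrative sketch of "thought grounding": each reasoning step is
# anchored to a triple retrieved from a toy in-memory knowledge graph.
# A real system would let an LLM propose the next step and verify it
# against the KG; here we deterministically follow the first outgoing edge.
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

class ToyKG:
    def __init__(self, triples: List[Triple]):
        self.triples = triples

    def outgoing(self, entity: str) -> List[Triple]:
        """Retrieve all triples whose subject is `entity`."""
        return [t for t in self.triples if t[0] == entity]

def grounded_chain(kg: ToyKG, start: str, hops: int) -> List[str]:
    """Build a chain of thoughts, each verbalized from a retrieved triple."""
    chain, entity = [], start
    for _ in range(hops):
        edges = kg.outgoing(entity)
        if not edges:
            break  # no grounded continuation: stop rather than hallucinate
        s, r, o = edges[0]
        chain.append(f"{s} --{r}--> {o}")  # thought anchored to a KG fact
        entity = o                          # traverse to the next entity
    return chain

# Hypothetical example triples for a citation-style domain graph.
kg = ToyKG([
    ("PaperA", "cites", "PaperB"),
    ("PaperB", "authored_by", "Alice"),
])
print(grounded_chain(kg, "PaperA", hops=2))
# → ['PaperA --cites--> PaperB', 'PaperB --authored_by--> Alice']
```

In this sketch, grounding acts as both a constraint (a step is emitted only if a supporting triple exists) and an enhancement (the retrieved facts supply the content of each thought), mirroring the dual role of KGs described in the summary.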