Inference Scaled GraphRAG: Improving Multi Hop Question Answering on Knowledge Graphs

📅 2025-06-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) underperform on knowledge-intensive multi-hop question answering because they cannot explicitly model structured, higher-order relational context. Method: This paper proposes Inference-Scaled GraphRAG, a framework that pairs deep chain-of-thought (CoT) reasoning with majority voting over parallel sampled trajectories. Without modifying the LLM architecture, it interleaves retrieval, graph traversal, and reasoning steps to explicitly capture high-order structural dependencies among nodes in a knowledge graph. Contribution/Results: By combining sequential and parallel inference-time scaling strategies, the method efficiently captures multi-hop relational patterns. On the GRBench benchmark, it significantly outperforms conventional RAG, standard GraphRAG, and graph-traversal baselines, achieving an absolute 12.6% improvement in multi-hop QA accuracy and demonstrating the efficacy of inference-time scaling for knowledge-intensive structured reasoning.
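The parallel-scaling half of the method, majority voting across sampled reasoning paths, can be sketched as follows. This is a minimal illustration, not the paper's implementation: `sample_trajectory` is a hypothetical stand-in for one full retrieve → traverse → reason pass that ends in a final answer string.

```python
from collections import Counter
from typing import Callable, List

def majority_vote(sample_trajectory: Callable[[], str], n_samples: int = 5) -> str:
    """Sample n independent reasoning trajectories and return the most
    frequent final answer (parallel inference-time scaling)."""
    answers: List[str] = [sample_trajectory() for _ in range(n_samples)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner

# Toy usage: four of five sampled trajectories agree on "Paris".
fake_answers = iter(["Paris", "Lyon", "Paris", "Paris", "Paris"])
print(majority_vote(lambda: next(fake_answers), n_samples=5))  # -> Paris
```

In practice each trajectory would be a full stochastic LLM rollout (temperature > 0), so disagreement between samples is expected and the vote filters out occasional traversal errors.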

📝 Abstract
Large Language Models (LLMs) have achieved impressive capabilities in language understanding and generation, yet they continue to underperform on knowledge-intensive reasoning tasks due to limited access to structured context and multi-hop information. Retrieval-Augmented Generation (RAG) partially mitigates this by grounding generation in retrieved context, but conventional RAG and GraphRAG methods often fail to capture relational structure across nodes in knowledge graphs. We introduce Inference-Scaled GraphRAG, a novel framework that enhances LLM-based graph reasoning by applying inference-time compute scaling. Our method combines sequential scaling with deep chain-of-thought graph traversal, and parallel scaling with majority voting over sampled trajectories within an interleaved reasoning-execution loop. Experiments on the GRBench benchmark demonstrate that our approach significantly improves multi-hop question answering performance, achieving substantial gains over both traditional GraphRAG and prior graph traversal baselines. These findings suggest that inference-time scaling is a practical and architecture-agnostic solution for structured knowledge reasoning with LLMs.
Problem

Research questions and friction points this paper is trying to address.

Enhancing multi-hop question answering on knowledge graphs
Improving relational structure capture in GraphRAG methods
Scaling inference-time compute for structured knowledge reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Inference-time compute scaling enhances graph reasoning
Deep chain-of-thought traversal improves multi-hop QA
Parallel majority voting optimizes sampled trajectories
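The sequential-scaling half, the interleaved reasoning-execution loop over the graph, can be sketched as below. The `llm(prompt)` and `graph.neighbors(node)` interfaces are assumptions for illustration, not the paper's actual API: each iteration the model either issues a traversal action or commits to a final answer.

```python
def interleaved_graph_reasoning(llm, graph, question, max_steps=8):
    """Sketch of a reason-act loop: the LLM alternates between CoT steps
    and graph-traversal tool calls until it emits an answer.

    `llm` and `graph` are hypothetical interfaces, not the paper's code.
    """
    context = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(context + "Next step (VISIT <node> or ANSWER <text>):")
        if step.startswith("ANSWER"):
            return step[len("ANSWER"):].strip()
        if step.startswith("VISIT"):
            node = step[len("VISIT"):].strip()
            obs = graph.neighbors(node)  # graph traversal executed as a tool call
            context += f"{step}\nObservation: {obs}\n"
    return None  # no answer found within the step budget
```

Sequential scaling here means allowing more loop iterations (a larger `max_steps` and deeper CoT per step); combining this loop with majority voting over several independent runs gives the paper's joint sequential-plus-parallel strategy.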