GRASP: Graph Agentic Search over Propositions for Multi-hop Question Answering

📅 2026-05-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses two costs of knowledge-graph-augmented agentic retrieval for multi-hop question answering: expensive graph construction at index time and compounding token usage at inference time. The authors propose GRASP, which decomposes a multi-hop query into a dependency-aware plan and scales the number of sub-agents to the complexity of the question. Each sub-agent resolves a single-hop query over a three-layer graph of entities, propositions, and passages, using the entity layer for targeted traversal and the proposition layer for high-recall passage retrieval via reciprocal-rank voting. The paper also introduces an efficiency-aware metric, "success economy", defined as the amortized token cost per correct answer, weighted by difficulty. In the open-retrieval setting, GRASP reports the highest QA accuracy on MuSiQue and 2WikiMultihopQA while using 40-50% fewer tokens than IRCoT+HippoRAG2. In the LongBench setting, it leads on EM and F1 across MuSiQue, 2WikiMultihopQA, and HotpotQA while using 30% fewer tokens than the next most accurate method.
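To make the idea of a dependency-aware plan concrete, here is a minimal Python sketch. It is not the authors' implementation: the plan format (a mapping from each sub-question to the sub-questions it depends on), the `plan_waves` helper, and the toy question ids are all assumptions made for illustration. It groups sub-questions into "waves" that could each be handed to sub-agents at the same time, so the number of sub-agents per wave follows the structure of the question.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def plan_waves(dependencies):
    """Group sub-questions into waves that can run concurrently.

    `dependencies` maps each sub-question id to the set of ids whose
    answers it needs first (a hypothetical plan format, not GRASP's).
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the plan is cyclic
    waves = []
    while sorter.is_active():
        # Every sub-question whose dependencies are already resolved.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as answered
    return waves


# Hypothetical plan: q3 needs the answers to q1 and q2.
plan = {"q1": set(), "q2": set(), "q3": {"q1", "q2"}}
waves = plan_waves(plan)
print(waves)
```

In this toy plan the first wave holds two independent sub-questions and the second holds the one that depends on both, so two sub-agents would be needed at most.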

📝 Abstract

Agentic retrieval improves multi-hop question answering by giving language models autonomy to iteratively gather evidence. Recent work augments these systems with knowledge graphs for structured traversal, but this combination introduces significant cost: expensive graph construction at index time and compounding token usage at inference time. We introduce Graph Agentic Search over Propositions (GRASP), an agentic system that simultaneously optimizes for high accuracy and minimal token usage in multi-hop question answering. Rather than executing a rigid, singular query, GRASP actively coordinates its retrieval strategy by decomposing multi-hop queries into dependency-aware plans. This enables GRASP to dynamically scale the number of sub-agents according to the complexity of the problem. Each sub-agent resolves its single-hop query by exploring a novel three-layer hierarchical graph of entities, propositions, and passages, using the entity layer for targeted traversal and the proposition layer for high-recall passage retrieval via reciprocal-rank voting. We evaluate GRASP on MuSiQue, 2WikiMultihopQA, and HotpotQA under two settings: open-corpus retrieval and extended context reasoning (LongBench). GRASP achieves the highest QA accuracy in the open retrieval setting on MuSiQue and 2Wiki while using 40-50 percent fewer tokens than IRCoT+HippoRAG2. Furthermore, GRASP leads on EM and F1 across all three datasets in the LongBench setting while using 30 percent fewer tokens than the next most accurate method. Finally, we introduce success economy - the amortized token cost per correct answer, weighted by difficulty - and advocate for efficiency-aware evaluation as a standard practice for agentic QA.
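The abstract does not give the formula behind reciprocal-rank voting, so the following Python sketch is only one plausible reading: each retrieved proposition votes for its parent passage with weight 1 / (k + rank), and passages are ordered by total votes. The function name, the smoothing constant `k=60` (borrowed from common reciprocal rank fusion practice), and the toy ids are assumptions, not details from the paper.

```python
from collections import defaultdict


def reciprocal_rank_vote(ranked_propositions, proposition_to_passage, k=60, top_n=3):
    """Rank passages by summed reciprocal-rank votes from their propositions.

    Assumed scoring rule (not taken from the paper): a proposition at
    1-based position `rank` adds 1 / (k + rank) to its parent passage.
    """
    scores = defaultdict(float)
    for rank, prop_id in enumerate(ranked_propositions, start=1):
        passage_id = proposition_to_passage[prop_id]
        scores[passage_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by passage id for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [passage_id for passage_id, _ in ranked[:top_n]]


# Hypothetical toy data: passage B owns two of the retrieved propositions.
ranked_props = ["p1", "p2", "p3", "p4"]
prop_to_passage = {"p1": "A", "p2": "B", "p3": "B", "p4": "C"}
top_passages = reciprocal_rank_vote(ranked_props, prop_to_passage, k=60, top_n=2)
print(top_passages)
```

Under this rule a passage backed by several moderately ranked propositions can outrank one backed by a single top-ranked proposition, which is one way a proposition layer could favor recall at the passage level.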

Problem

Research questions and friction points this paper is trying to address.

multi-hop question answering

agentic retrieval

token efficiency

knowledge graph

retrieval cost

Innovation

Methods, ideas, or system contributions that make the work stand out.

agentic retrieval

multi-hop question answering

hierarchical knowledge graph