🤖 AI Summary
In knowledge graph reasoning (KGR), existing large language model (LLM)-based approaches suffer from severe path noise interference and high LLM invocation overhead. To address these challenges, we propose PathMind, a novel “retrieve–prioritize–reason” framework. First, it retrieves candidate reasoning paths via subgraph-aware retrieval. Second, it introduces a semantic-aware path priority function that jointly models the cumulative cost and the estimated future cost to dynamically rank paths and select the most important ones. Third, it employs a two-stage instruction tuning strategy with path-level preference alignment to improve logical consistency and inference efficiency. PathMind significantly reduces noise from irrelevant paths and lowers LLM call frequency. Empirical results across multiple benchmarks demonstrate superior accuracy with fewer input tokens, particularly on complex multi-hop reasoning tasks, establishing new state-of-the-art performance while improving computational efficiency.
📝 Abstract
Knowledge graph reasoning (KGR) is the task of inferring new knowledge by performing logical deductions on knowledge graphs. Recently, large language models (LLMs) have demonstrated remarkable performance in complex reasoning tasks. Despite this promising success, current LLM-based KGR methods still face two critical limitations. First, existing methods often extract reasoning paths indiscriminately, without assessing their relative importance, which may introduce irrelevant noise that misleads LLMs. Second, while many methods leverage LLMs to dynamically explore potential reasoning paths, they incur high retrieval overhead and frequent LLM calls. To address these limitations, we propose PathMind, a novel framework designed to enhance faithful and interpretable reasoning by selectively guiding LLMs with important reasoning paths. Specifically, PathMind follows a "Retrieve-Prioritize-Reason" paradigm. First, it retrieves a query subgraph from the KG through a retrieval module. Next, it introduces a path prioritization mechanism that identifies important reasoning paths using a semantic-aware path priority function, which jointly considers the cumulative cost incurred so far and the estimated future cost of reaching the target. Finally, PathMind generates accurate and logically consistent responses via a dual-phase training strategy comprising task-specific instruction tuning and path-wise preference alignment. Extensive experiments on benchmark datasets demonstrate that, by identifying essential reasoning paths, PathMind consistently outperforms competitive baselines while using fewer input tokens, particularly on complex reasoning tasks.
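The path priority function described above, combining a cumulative cost with an estimated future cost, has the same shape as the f = g + h scoring used in A*-style search. The following is a minimal sketch of that idea; the function names, the use of cosine distance over embeddings, and the dictionary layout are illustrative assumptions, not the paper's actual implementation.

```python
from math import sqrt


def cosine_distance(u, v):
    # 1 - cosine similarity; smaller means semantically closer.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def path_priority(step_embeddings, frontier_embedding, query_embedding, target_embedding):
    # g: cumulative cost -- semantic distance of each traversed step to the query.
    g = sum(cosine_distance(e, query_embedding) for e in step_embeddings)
    # h: estimated future cost -- distance from the path's frontier entity
    # to the target (a heuristic stand-in for the remaining hops).
    h = cosine_distance(frontier_embedding, target_embedding)
    return g + h


def prioritize(paths, query_embedding, target_embedding, top_k=2):
    # Rank candidate paths by f = g + h (lowest cost first) and keep the top-k,
    # so only the most relevant paths are passed on to the LLM.
    scored = sorted(
        paths,
        key=lambda p: path_priority(
            p["steps"], p["frontier"], query_embedding, target_embedding
        ),
    )
    return scored[:top_k]
```

Under this toy scoring, a candidate path whose steps and frontier entity align with the query and target embeddings receives a lower cost and is ranked ahead of semantically unrelated paths.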