🤖 AI Summary
Existing approaches to multi-hop question answering (QA) over schema-free knowledge graphs struggle with a shortage of high-quality training data and with modeling reasoning paths of three or more hops. Method: This paper proposes a framework that integrates memory-augmented stepwise reasoning with multi-stage training. Specifically, (1) it constructs a lightweight, domain-agnostic knowledge graph and automatically generates large-scale, diverse multi-hop QA pairs using rule-based heuristics and large language models; (2) it designs an external-memory-enhanced architecture that explicitly models long-range relational paths; and (3) it jointly optimizes path retrieval and answer generation via supervised fine-tuning and reinforcement learning. Contribution/Results: Experiments demonstrate substantial improvements over state-of-the-art methods across multiple benchmarks, including a 12.6% absolute accuracy gain on 3+-hop questions, as well as robust cross-domain transferability, offering a scalable solution for low-resource multi-hop QA.
📝 Abstract
This paper introduces Omne-R1, a novel approach designed to enhance multi-hop question answering on schema-free knowledge graphs by integrating advanced reasoning models. Our method employs a multi-stage training workflow comprising two reinforcement learning phases and one supervised fine-tuning phase. To address the scarcity of suitable knowledge graphs and QA data, we construct domain-independent knowledge graphs and automatically generate QA pairs. Experimental results show significant improvements in answering multi-hop questions, with notable gains on the more complex 3+-hop questions. The proposed training framework also demonstrates strong generalization across diverse knowledge domains.
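The auto-generation of multi-hop QA pairs from a knowledge graph, as described above, can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the adjacency-map representation, the function names (`build_knowledge_graph`, `generate_qa_pairs`), and the path-template question format are all hypothetical stand-ins, not the paper's actual implementation, which additionally uses rule-based heuristics and large language models to diversify the questions.

```python
def build_knowledge_graph(triples):
    """Index (head, relation, tail) triples into a schema-free adjacency map."""
    graph = {}
    for head, relation, tail in triples:
        graph.setdefault(head, []).append((relation, tail))
    return graph


def generate_qa_pairs(graph, start, hops):
    """Enumerate relation paths of exactly `hops` edges from `start`,
    turning each path into a (question, answer) pair via a simple template."""
    pairs = []

    def walk(node, path):
        if len(path) == hops:
            # The entity reached at the end of the path is the answer.
            pairs.append((f"{start}: {' -> '.join(path)}?", node))
            return
        for relation, tail in graph.get(node, []):
            walk(tail, path + [relation])

    walk(start, [])
    return pairs


# Tiny example: a 2-hop question is generated by chaining two relations.
graph = build_knowledge_graph([
    ("Paris", "capital_of", "France"),
    ("France", "part_of", "EU"),
])
print(generate_qa_pairs(graph, "Paris", 2))
```

In a full pipeline, templated pairs like these would typically be paraphrased into natural language before serving as training data for the fine-tuning and reinforcement learning stages.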