Omne-R1: Learning to Reason with Memory for Multi-hop Question Answering

📅 2025-08-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing approaches for multi-hop question answering (QA) over schema-less knowledge graphs struggle with insufficient high-quality training data and ineffective modeling for three-hop or longer reasoning paths. Method: This paper proposes a novel framework integrating memory-augmented stepwise reasoning with multi-stage training. Specifically, (1) it constructs a lightweight, domain-agnostic general-purpose knowledge graph and automatically generates large-scale, diverse multi-hop QA pairs using rule-based heuristics and large language models; (2) it designs an external memory–enhanced architecture that explicitly models long-range relational paths; and (3) it jointly optimizes path retrieval and answer generation via supervised fine-tuning and reinforcement learning. Contribution/Results: Experiments demonstrate substantial improvements over state-of-the-art methods across multiple benchmarks—achieving a 12.6% absolute accuracy gain on 3+-hop questions—and robust cross-domain transferability, offering a scalable solution for low-resource multi-hop QA.

Technology Category

Application Category

📝 Abstract
This paper introduces Omne-R1, a novel approach designed to enhance multi-hop question answering capabilities on schema-free knowledge graphs by integrating advanced reasoning models. Our method employs a multi-stage training workflow, including two reinforcement learning phases and one supervised fine-tuning phase. We address the challenge of limited suitable knowledge graphs and QA data by constructing domain-independent knowledge graphs and auto-generating QA pairs. Experimental results show significant improvements in answering multi-hop questions, with notable performance gains on more complex 3+ hop questions. Our proposed training framework demonstrates strong generalization abilities across diverse knowledge domains.
Problem

Research questions and friction points this paper is trying to address.

Enhancing multi-hop question answering on schema-free knowledge graphs
Addressing limited suitable knowledge graphs and QA data availability
Improving performance on complex multi-hop reasoning tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-stage reinforcement learning training workflow
Auto-generating domain-independent knowledge graphs
Enhanced reasoning for complex multi-hop questions
🔎 Similar Papers
No similar papers found.
B
Boyuan Liu
Tanka AI team, Tanka Inc.
F
Feng Ji
Tanka AI team, Tanka Inc.
J
Jiayan Nan
Tanka AI team, Tanka Inc.
H
Han Zhao
Tanka AI team, Tanka Inc.
W
Weiling Chen
Tanka AI team, Tanka Inc.
S
Shihao Xu
Tanka AI team, Tanka Inc.
Xing Zhou
Xing Zhou
Computer Science, University of Illinois at Urbana-Champaign
Compiler Optimizations