Omne-R1: Learning to Reason with Memory for Multi-hop Question Answering

📅 2025-08-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing approaches for multi-hop question answering (QA) over schema-less knowledge graphs struggle with insufficient high-quality training data and ineffective modeling of reasoning paths with three or more hops. Method: This paper proposes a novel framework integrating memory-augmented stepwise reasoning with multi-stage training. Specifically, (1) it constructs a lightweight, domain-agnostic general-purpose knowledge graph and automatically generates large-scale, diverse multi-hop QA pairs using rule-based heuristics and large language models; (2) it designs an external memory–enhanced architecture that explicitly models long-range relational paths; and (3) it jointly optimizes path retrieval and answer generation via supervised fine-tuning and reinforcement learning. Contribution/Results: Experiments demonstrate substantial improvements over state-of-the-art methods across multiple benchmarks, including a 12.6% absolute accuracy gain on 3+-hop questions, and robust cross-domain transferability, offering a scalable solution for low-resource multi-hop QA.

📝 Abstract

This paper introduces Omne-R1, a novel approach designed to enhance multi-hop question answering capabilities on schema-free knowledge graphs by integrating advanced reasoning models. Our method employs a multi-stage training workflow, including two reinforcement learning phases and one supervised fine-tuning phase. We address the challenge of limited suitable knowledge graphs and QA data by constructing domain-independent knowledge graphs and auto-generating QA pairs. Experimental results show significant improvements in answering multi-hop questions, with notable performance gains on more complex 3+ hop questions. Our proposed training framework demonstrates strong generalization abilities across diverse knowledge domains.
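
To make the idea of auto-generating multi-hop QA pairs from a schema-free knowledge graph more concrete, here is a minimal sketch. It is not the paper's pipeline: the toy triples, the helper names (build_index, enumerate_paths, path_to_qa), and the question template are all assumptions made for illustration. It only shows how fixed-length relation paths over plain (head, relation, tail) triples can be turned into question-answer pairs with a known hop count.

```python
from collections import defaultdict

# Toy schema-free knowledge graph: plain (head, relation, tail) triples.
# Every entity and relation below is invented for illustration.
TRIPLES = [
    ("Alice", "works_at", "Acme"),
    ("Acme", "headquartered_in", "Berlin"),
    ("Berlin", "capital_of", "Germany"),
    ("Bob", "works_at", "Acme"),
]


def build_index(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[head].append((relation, tail))
    return index


def enumerate_paths(index, start, hops):
    """Return every cycle-free path of exactly `hops` edges from `start`."""
    paths = []

    def walk(entity, path, visited):
        if len(path) == hops:
            paths.append(list(path))
            return
        for relation, tail in index.get(entity, []):
            if tail in visited:
                continue  # skip entities already on the path
            path.append((entity, relation, tail))
            walk(tail, path, visited | {tail})
            path.pop()

    walk(start, [], {start})
    return paths


def path_to_qa(path):
    """Turn a relation path into a templated question and its answer."""
    relations = " -> ".join(relation for _, relation, _ in path)
    start = path[0][0]
    answer = path[-1][2]
    question = f"Starting from {start}, follow {relations}. What do you reach?"
    return {"question": question, "answer": answer, "hops": len(path)}


index = build_index(TRIPLES)
qa_pairs = [path_to_qa(p) for p in enumerate_paths(index, "Alice", 3)]
```

In this sketch the hop count is a parameter, so the same routine can produce 2-hop or 3+ hop questions. A real generator would likely rewrite the templated question into natural language, for example with a language model, which is outside the scope of this example.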

Problem

Research questions and friction points this paper is trying to address.

Enhancing multi-hop question answering on schema-free knowledge graphs

Addressing limited suitable knowledge graphs and QA data availability

Improving performance on complex multi-hop reasoning tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-stage training workflow with two reinforcement learning phases and one supervised fine-tuning phase

Constructing domain-independent knowledge graphs and auto-generating QA pairs

Enhanced reasoning for complex multi-hop questions

🔎 Similar Papers

Do Large Language Models Latently Perform Multi-Hop Reasoning?