Orion-RAG: Path-Aligned Hybrid Retrieval for Graphless Data

2026-01-08
arXiv.org
Citations: 0
Influential: 0
AI Summary
This work addresses the challenge posed by discrete, fragmented data, such as reports and logs, that lacks explicit inter-document relationships, which prevents traditional Retrieval-Augmented Generation (RAG) systems from performing effective cross-document retrieval. To overcome this limitation, the authors propose a lightweight path-aligned hybrid retrieval method that removes the need for explicit knowledge graph construction. Using a low-complexity strategy, the approach automatically uncovers latent semantic pathways among isolated documents, transforming unstructured data into a semi-structured format that supports efficient, interpretable, and real-time-updatable cross-document linking. Combining a path-alignment mechanism, a hybrid retrieval strategy, and a Human-in-the-Loop validation framework, the method outperforms state-of-the-art RAG systems on benchmarks such as FinanceBench, achieving a 25.2% relative improvement in precision over baseline models while remaining cost-effective and practical.

Abstract
Retrieval-Augmented Generation (RAG) has proven effective for knowledge synthesis, yet it encounters significant challenges in practical scenarios where data is inherently discrete and fragmented. In most environments, information is distributed across isolated files like reports and logs that lack explicit links. Standard search engines process files independently, ignoring the connections between them. Furthermore, manually building Knowledge Graphs is impractical for such vast data. To bridge this gap, we present Orion-RAG. Our core insight is simple yet effective: we do not need heavy algorithms to organize this data. Instead, we use a low-complexity strategy to extract lightweight paths that naturally link related concepts. We demonstrate that this streamlined approach suffices to transform fragmented documents into semi-structured data, enabling the system to link information across different files effectively. Extensive experiments demonstrate that Orion-RAG consistently outperforms mainstream frameworks across diverse domains, supporting real-time updates and explicit Human-in-the-Loop verification with high cost-efficiency. Experiments on FinanceBench demonstrate superior precision with a 25.2% relative improvement over strong baselines.
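The abstract does not specify how paths are extracted or how retrieval scores are combined, so the following is only a toy sketch of the general idea, not the authors' method. It assumes each file has already been given a short tuple of tags (a hypothetical stand-in for a "lightweight path") and ranks files by a weighted mix of word overlap and tag overlap. The corpus, the tags, the function names, and the `alpha` weight are all invented for illustration.

```python
from collections import defaultdict

# Toy corpus: isolated files with no explicit links (contents are made up).
DOCS = {
    "report_q1.txt": "Q1 revenue grew on strong cloud demand",
    "report_q2.txt": "Q2 revenue fell as cloud demand slowed",
    "ops_log.txt": "cloud outage traced to storage cluster failure",
}

# Hypothetical lightweight "paths": a few hand-assigned tags per file.
PATHS = {
    "report_q1.txt": ("finance", "revenue", "cloud"),
    "report_q2.txt": ("finance", "revenue", "cloud"),
    "ops_log.txt": ("operations", "incident", "cloud"),
}


def tokenize(text):
    """Lowercase and split on whitespace; returns a set of words."""
    return set(text.lower().split())


def build_path_index(paths):
    """Map each tag to the set of files whose path contains it."""
    index = defaultdict(set)
    for doc_id, path in paths.items():
        for tag in path:
            index[tag].add(doc_id)
    return index


def hybrid_search(query, docs, paths, alpha=0.5):
    """Rank files by alpha * text overlap + (1 - alpha) * path-tag overlap."""
    q = tokenize(query)
    if not q:
        return []
    scores = {}
    for doc_id, text in docs.items():
        text_score = len(q & tokenize(text)) / len(q)
        tags = set(paths[doc_id])
        path_score = len(q & tags) / len(tags)
        scores[doc_id] = alpha * text_score + (1 - alpha) * path_score
    # Highest score first; ties broken by file name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch, `build_path_index(PATHS)["revenue"]` groups the two quarterly reports even though neither file references the other, which is the kind of cross-file link the abstract describes. A query such as `"cloud incident"` ranks `ops_log.txt` first because its tags match the query in addition to its text.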
Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation
fragmented data
graphless data
knowledge synthesis
information linking
Innovation

Methods, ideas, or system contributions that make the work stand out.

Path-Aligned Retrieval
Graphless Data
Lightweight Path Extraction
Retrieval-Augmented Generation
Semi-Structured Data