Can we Retrieve Everything All at Once? ARM: An Alignment-Oriented LLM-based Retrieval Method

📅 2025-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address low efficiency and poor accuracy in multi-source information retrieval for open-domain complex question answering, this paper proposes ARM, an alignment-oriented LLM-based retrieval method. ARM aligns the question with how the data collection is organized (for example, entity relationships and hierarchical organization), going beyond conventional semantic matching to model relationships across data objects and to avoid iterative, dependency-based retrieval. Its core components include structure-aware query rewriting, data-organization graph modeling, and a joint alignment optimization framework. On the Bird benchmark, ARM improves execution accuracy by up to 5.2 points over standard RAG with query decomposition and by up to 15.9 points over ReAct-style agentic RAG; on OTT-QA, it yields F1 match gains of up to 5.5 and 19.3 points, respectively. These results indicate that retrieval guided by the organization of the data is effective for complex question answering.

📝 Abstract
Real-world open-domain questions can be complicated, particularly when answering them involves information from multiple sources. LLMs have demonstrated impressive performance in decomposing complex tasks into simpler steps, and previous work has used this capability to improve retrieval in support of complex questions. However, an LLM's decomposition of a question is unaware of what data is available and how that data is organized, which often leads to suboptimal retrieval performance. Recent efforts in agentic RAG propose to perform retrieval iteratively, where each follow-up query is derived as an action based on previous rounds of retrieval. While this provides one way of interacting with the data collection, agentic RAG's exploration of the data is inefficient because successive queries depend on previous results rather than being guided by the organization of the available data in the collection. To address this problem, we propose an LLM-based retrieval method, ARM, that aims to better align the question with the organization of the data collection by exploring relationships among data objects beyond matching the utterance of the query, thus leading to a retrieve-all-at-once solution for complex queries. We evaluated ARM on two datasets, Bird and OTT-QA. On Bird, it outperforms standard RAG with query decomposition by up to 5.2 pt in execution accuracy and agentic RAG (ReAct) by up to 15.9 pt. On OTT-QA, it achieves up to 5.5 pt and 19.3 pt higher F1 match scores compared to these approaches.
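The contrast the abstract draws between iterative retrieval and a retrieve-all-at-once approach can be illustrated with a toy sketch. This is not ARM's algorithm and involves no LLM: the collection, the keyword-overlap matching, and the function names `iterative_retrieve` and `retrieve_all_at_once` are all hypothetical, chosen only to show how knowing the links between data objects can replace several dependent rounds of querying.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# Hypothetical toy collection: object name -> keywords describing its content.
COLLECTION: Dict[str, Set[str]] = {
    "players": {"player", "name", "team_id"},
    "teams": {"team_id", "team", "city_id"},
    "cities": {"city_id", "city", "population"},
    "weather": {"city", "temperature"},
}


def keyword_match(terms: Set[str]) -> List[str]:
    """Return objects that share at least one keyword with the given terms."""
    return sorted(name for name, kws in COLLECTION.items() if kws & terms)


def iterative_retrieve(question_terms: Set[str]) -> Tuple[List[str], int]:
    """Each round's query is built from what earlier rounds returned."""
    seen: Set[str] = set()
    terms = set(question_terms)
    rounds = 0
    while True:
        rounds += 1
        new = set(keyword_match(terms)) - seen
        if not new:
            return sorted(seen), rounds
        seen |= new
        for name in new:
            terms |= COLLECTION[name]  # follow-up query depends on results


def build_links() -> Dict[str, Set[str]]:
    """Link two objects when they share a keyword (a stand-in for join keys)."""
    links: Dict[str, Set[str]] = {name: set() for name in COLLECTION}
    for a in COLLECTION:
        for b in COLLECTION:
            if a != b and COLLECTION[a] & COLLECTION[b]:
                links[a].add(b)
    return links


def shortest_path(links: Dict[str, Set[str]], src: str, dst: str) -> List[str]:
    """Breadth-first search for the shortest chain of linked objects."""
    parents = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(links[node]):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return []


def retrieve_all_at_once(question_terms: Set[str]) -> List[str]:
    """Match the question once, then add the objects that connect the matches."""
    links = build_links()
    seeds = keyword_match(set(question_terms))
    selected = set(seeds)
    for i, src in enumerate(seeds):
        for dst in seeds[i + 1:]:
            selected |= set(shortest_path(links, src, dst))
    return sorted(selected)
```

For a question with the terms `{"player", "population"}`, the iterative version in this toy needs three rounds and also pulls in the unrelated `weather` object, while the organization-guided version returns `players`, `teams`, and `cities` in a single pass. The toy only mirrors the intuition; it says nothing about how ARM itself performs alignment.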
Problem

Research questions and friction points this paper is trying to address.

Complex Problem Solving
Information Retrieval Efficiency
Data Correlation
Innovation

Methods, ideas, or system contributions that make the work stand out.

ARM Methodology
Efficient Information Retrieval
Enhanced Precision and Recall