MBA-RAG: a Bandit Approach for Adaptive Retrieval-Augmented Generation through Question Complexity

📅 2024-12-02
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
Existing RAG methods employ rigid retrieval strategies ill-suited to queries of varying complexity, leading to suboptimal performance on knowledge-intensive tasks such as multi-hop reasoning. To address this, we propose the first Multi-Armed Bandit (MAB)-based adaptive Retrieval-Augmented Generation framework. Our method jointly models query complexity and employs reinforcement learning to dynamically balance retrieval precision and efficiency via a novel reward function—incorporating explicit step-wise penalties for excessive retrieval operations—to enable online, adaptive policy optimization. Evaluated across multiple single-hop and multi-hop benchmarks, our approach achieves state-of-the-art performance, significantly reducing average retrieval overhead while improving generation accuracy. The core contribution lies in the tight integration of query-complexity-aware modeling with MAB-driven dynamic decision-making, thereby transcending conventional static or heuristic retrieval paradigms.

📝 Abstract
Retrieval-Augmented Generation (RAG) has proven highly effective at boosting the generative performance of language models on knowledge-intensive tasks. However, existing RAG frameworks either indiscriminately perform retrieval or rely on rigid single-class classifiers to select retrieval methods, leading to inefficiencies and suboptimal performance across queries of varying complexity. To address these challenges, we propose a reinforcement learning-based framework that dynamically selects the most suitable retrieval strategy based on query complexity. Our approach leverages a multi-armed bandit algorithm, which treats each retrieval method as a distinct "arm" and adapts the selection process by balancing exploration and exploitation. Additionally, we introduce a dynamic reward function that balances accuracy and efficiency, penalizing methods that require more retrieval steps even if they lead to a correct result. Our method achieves new state-of-the-art results on multiple single-hop and multi-hop datasets while reducing retrieval costs. Our code is available at https://github.com/FUTUREEEEEE/MBA .
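The abstract's core mechanism can be sketched in a few lines: a bandit treats each retrieval strategy as an arm and updates its value estimate from a reward that subtracts a per-step penalty. This is a minimal epsilon-greedy illustration, not the paper's implementation; the arm names, the `epsilon` and `penalty` values, and the incremental-mean update are illustrative assumptions.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy multi-armed bandit over retrieval strategies.

    Each arm is a retrieval method (e.g. no retrieval, single-step,
    multi-step); value estimates are running averages of past rewards.
    """

    def __init__(self, arms, epsilon=0.1):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = {a: 0 for a in self.arms}
        self.values = {a: 0.0 for a in self.arms}

    def select(self):
        # Explore a random arm with probability epsilon,
        # otherwise exploit the arm with the highest estimated value.
        if random.random() < self.epsilon:
            return random.choice(self.arms)
        return max(self.arms, key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental-mean update of the chosen arm's value estimate.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def step_penalized_reward(correct, steps, penalty=0.1):
    """Reward in the spirit of the abstract: accuracy minus a
    penalty proportional to the number of retrieval steps used."""
    return (1.0 if correct else 0.0) - penalty * steps
```

Under this reward, a multi-step strategy that answers correctly but uses three retrievals (reward 0.7) is ranked below a single-step strategy that also answers correctly (reward 0.9), which is how the framework can learn to reserve expensive retrieval for queries that need it.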
Problem

Research questions and friction points this paper is trying to address.

RAG Technology
Variable Difficulty Problems
Knowledge-Intensive Tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

MBA-RAG
Reinforcement Learning
Multi-Armed Bandit Algorithm
Xiaqiang Tang
HKUST(GZ)
LLM, RAG, Trustworthy AI
Q. Gao
Tencent Hunyuan, Wuhan University
Jian Li
Tencent Hunyuan
Nan Du
Tencent Hunyuan
Qi Li
Iowa State University
Sihong Xie
Associate Professor at AI Thrust, Information Hub, HKUST-GZ
data mining, machine learning