🤖 AI Summary
Existing RAG methods employ rigid retrieval strategies ill-suited to queries of varying complexity, leading to suboptimal performance on knowledge-intensive tasks such as multi-hop reasoning. To address this, we propose the first Multi-Armed Bandit (MAB)-based adaptive Retrieval-Augmented Generation framework. Our method jointly models query complexity and employs reinforcement learning to dynamically balance retrieval precision and efficiency via a novel reward function—incorporating explicit step-wise penalties for excessive retrieval operations—to enable online, adaptive policy optimization. Evaluated across multiple single-hop and multi-hop benchmarks, our approach achieves state-of-the-art performance, significantly reducing average retrieval overhead while improving generation accuracy. The core contribution lies in the tight integration of query-complexity-aware modeling with MAB-driven dynamic decision-making, thereby transcending conventional static or heuristic retrieval paradigms.
📝 Abstract
Retrieval-Augmented Generation (RAG) has proven highly effective in boosting the generative performance of language models on knowledge-intensive tasks. However, existing RAG frameworks either indiscriminately perform retrieval or rely on rigid single-class classifiers to select retrieval methods, leading to inefficiencies and suboptimal performance across queries of varying complexity. To address these challenges, we propose a reinforcement learning-based framework that dynamically selects the most suitable retrieval strategy based on query complexity. Our approach leverages a multi-armed bandit algorithm, which treats each retrieval method as a distinct "arm" and adapts the selection process by balancing exploration and exploitation. Additionally, we introduce a dynamic reward function that balances accuracy and efficiency, penalizing methods that require more retrieval steps even when they yield a correct result. Our method achieves new state-of-the-art results on multiple single-hop and multi-hop datasets while reducing retrieval costs. Our code is available at https://github.com/FUTUREEEEEE/MBA.
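The bandit-over-retrieval-strategies idea can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the strategy names, the epsilon-greedy selection rule, the penalty weight, and the exact reward shape (1 for a correct answer minus a per-step cost) are all assumptions made for the example.

```python
import random

# Hypothetical retrieval strategies — the bandit's "arms".
STRATEGIES = ["no_retrieval", "single_step", "multi_step"]
STEP_PENALTY = 0.1   # assumed per-retrieval-step cost
EPSILON = 0.1        # exploration rate (epsilon-greedy)

class RetrievalBandit:
    """Epsilon-greedy bandit that picks a retrieval strategy per query."""

    def __init__(self):
        self.counts = {s: 0 for s in STRATEGIES}    # pulls per arm
        self.values = {s: 0.0 for s in STRATEGIES}  # running mean reward per arm

    def select(self):
        # Explore with probability EPSILON, otherwise exploit the best arm so far.
        if random.random() < EPSILON:
            return random.choice(STRATEGIES)
        return max(STRATEGIES, key=lambda s: self.values[s])

    def update(self, strategy, correct, steps):
        # Reward trades accuracy off against efficiency: a correct answer
        # earns 1.0, and every retrieval step taken is penalized — so a
        # cheaper strategy that is equally accurate earns a higher reward.
        reward = (1.0 if correct else 0.0) - STEP_PENALTY * steps
        self.counts[strategy] += 1
        n = self.counts[strategy]
        # Incremental update of the running mean reward for this arm.
        self.values[strategy] += (reward - self.values[strategy]) / n
        return reward

bandit = RetrievalBandit()
# A correct multi-hop answer that needed 3 retrieval steps is rewarded
# less than a correct single-step answer would be.
bandit.update("multi_step", correct=True, steps=3)
bandit.update("single_step", correct=True, steps=1)
print(bandit.select())
```

Under this toy reward, `single_step` (mean reward 0.9) dominates `multi_step` (0.7) once both answers are correct, so exploitation favors the cheaper strategy — the step penalty is what keeps the policy from over-retrieving.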