Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control

📅 2025-02-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing RAG frameworks retrieve external knowledge both redundantly and insufficiently: static strategies over-retrieve on simple queries or fail on queries that need iterative reasoning, while current adaptive methods route only by estimated query complexity and give users no control. This paper proposes a user-tunable framework for the accuracy-cost trade-off. Two classifiers, one favoring accuracy and one favoring retrieval efficiency, jointly select the retrieval strategy, and an interpretable control parameter α lets users switch on demand between high-accuracy and low-overhead retrieval. The authors report that this dynamic routing balances accuracy against retrieval cost across multiple benchmarks, and that adjusting α lets users tailor the performance-efficiency trade-off to their deployment.

📝 Abstract
Retrieval-Augmented Generation (RAG) has emerged as a powerful approach to mitigate large language model (LLM) hallucinations by incorporating external knowledge retrieval. However, existing RAG frameworks often apply retrieval indiscriminately, leading to inefficiencies: over-retrieving when retrieval is unnecessary, or failing to retrieve iteratively when complex reasoning requires it. Recent adaptive retrieval strategies choose among these retrieval strategies, but they predict based only on query complexity and lack user-driven flexibility, making them ill-suited to diverse user application needs. In this paper, we introduce a novel user-controllable RAG framework that enables dynamic adjustment of the accuracy-cost trade-off. Our approach leverages two classifiers: one trained to prioritize accuracy and another to prioritize retrieval efficiency. Via an interpretable control parameter $\alpha$, users can seamlessly navigate between minimal-cost retrieval and high-accuracy retrieval based on their specific requirements. We empirically demonstrate that our approach effectively balances accuracy, retrieval cost, and user controllability, making it a practical and adaptable solution for real-world applications.
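
To make the role of the control parameter concrete, here is a minimal Python sketch of how two classifiers and a single α could jointly pick a retrieval strategy. The abstract does not state how the two classifiers are combined, so the convex blend of their probabilities, the three strategy names, and the `route` function are all illustrative assumptions and should not be read as the authors' implementation.

```python
from typing import Sequence

# Hypothetical strategy labels, ordered from cheapest to most expensive.
STRATEGIES = ("no_retrieval", "single_step", "multi_step")


def route(p_cost: Sequence[float], p_acc: Sequence[float], alpha: float) -> str:
    """Pick a retrieval strategy from two classifiers' probabilities.

    p_cost: output of the classifier trained to prioritize retrieval efficiency.
    p_acc:  output of the classifier trained to prioritize accuracy.
    alpha:  0.0 follows the cost classifier, 1.0 follows the accuracy classifier.

    Assumption: the outputs are blended linearly; the source does not
    specify the actual combination rule.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(p_cost) != len(STRATEGIES) or len(p_acc) != len(STRATEGIES):
        raise ValueError("expected one probability per strategy")

    # Convex combination of the two distributions, one value per strategy.
    blended = [(1.0 - alpha) * c + alpha * a for c, a in zip(p_cost, p_acc)]

    # Choose the strategy with the highest blended probability.
    best = max(range(len(blended)), key=blended.__getitem__)
    return STRATEGIES[best]


# Made-up classifier outputs for a single query.
cost_probs = [0.7, 0.2, 0.1]  # cost classifier prefers skipping retrieval
acc_probs = [0.1, 0.3, 0.6]   # accuracy classifier prefers multi-step retrieval

low_cost_choice = route(cost_probs, acc_probs, alpha=0.0)
high_acc_choice = route(cost_probs, acc_probs, alpha=1.0)
```

With these made-up inputs, α = 0 selects the cheapest strategy and α = 1 selects the most expensive one; intermediate values shift the decision gradually, which is the kind of user-facing dial the abstract describes.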

Problem

Research questions and friction points this paper is trying to address.

Balancing accuracy and cost
User-controllable retrieval strategies
Enhancing Retrieval-Augmented Generation efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

User-controllable RAG framework
Dynamic accuracy-cost adjustment
Two-priority classifier system