🤖 AI Summary
This work addresses the computational cost and latency that redundant search steps cause in agentic retrieval-augmented generation (RAG) systems, along with the under-exploration of complex questions that results when search depth is simply capped. The authors propose AutoSearch, a reinforcement learning framework that uses a self-answering mechanism to evaluate each search step and identify the minimal sufficient search depth, rewarding its attainment and penalizing over-searching. Additional reward mechanisms stabilize search behavior and improve answer quality on complex questions. Across multiple benchmarks, AutoSearch achieves a superior accuracy-efficiency trade-off, reducing over-searching while preserving search quality.
📝 Abstract
Agentic retrieval-augmented generation (RAG) systems enable large language models (LLMs) to solve complex tasks through multi-step interaction with external retrieval tools. However, such multi-step interaction often involves redundant search steps, incurring substantial computational cost and latency. Prior work limits search depth (i.e., the number of search steps) to reduce cost, but this often leads to under-exploration of complex questions. To address this, we first investigate how search depth affects accuracy and find that there is a minimal sufficient search depth, jointly determined by question complexity and the agent's capability, that defines the accuracy-efficiency trade-off. Building on this finding, we propose AutoSearch, a reinforcement learning (RL) framework that evaluates each search step via self-generated intermediate answers. Through this self-answering mechanism, AutoSearch identifies the minimal sufficient search depth and promotes efficient search by rewarding its attainment while penalizing over-searching. In addition, we introduce reward mechanisms that stabilize search behavior and improve answer quality on complex questions. Extensive experiments on multiple benchmarks show that AutoSearch achieves a superior accuracy-efficiency trade-off, alleviating over-searching while preserving search quality.
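To make the idea of a minimal sufficient search depth concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the agent's self-generated intermediate answers are available as a list indexed by the number of search steps taken so far, and that correctness is judged by a simple exact-match check. The function names, the `bonus` and `penalty` values, and the linear penalty shape are all hypothetical; the abstract does not specify the actual reward.

```python
from typing import Callable, List, Optional


def exact_match(answer: str, gold: str) -> bool:
    """Hypothetical correctness check: case-insensitive exact match."""
    return answer.strip().lower() == gold.strip().lower()


def minimal_sufficient_depth(
    intermediate_answers: List[str],
    gold: str,
    is_correct: Callable[[str, str], bool] = exact_match,
) -> Optional[int]:
    """Return the smallest number of search steps after which the
    self-generated answer is already correct, or None if it never is.

    intermediate_answers[d] is the answer produced after d search steps
    (index 0 is the answer before any search).
    """
    for depth, answer in enumerate(intermediate_answers):
        if is_correct(answer, gold):
            return depth
    return None


def depth_reward(
    intermediate_answers: List[str],
    gold: str,
    bonus: float = 1.0,     # assumed reward for reaching a correct answer
    penalty: float = 0.1,   # assumed cost per search step beyond the minimum
) -> float:
    """Illustrative reward: full bonus when the trajectory stops at the
    minimal sufficient depth, reduced linearly for each extra search step."""
    d_min = minimal_sufficient_depth(intermediate_answers, gold)
    if d_min is None:
        return 0.0  # no intermediate answer was ever correct
    total_depth = len(intermediate_answers) - 1
    extra_steps = total_depth - d_min
    return max(0.0, bonus - penalty * extra_steps)
```

For example, a trajectory whose answers are `["Lyon", "Marseille", "Paris", "Paris"]` against the gold answer `"Paris"` has a minimal sufficient depth of 2 but took 3 search steps, so under these assumed values it is penalized for one redundant step.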