🤖 AI Summary
This paper addresses the unclear relationship between query quality and final answer quality in Agentic Retrieval-Augmented Generation (RAG). As an initial study, it applies Query Performance Prediction (QPP) to this paradigm and examines the retriever’s role in the question-answering chain. Empirical analysis of two representative Agentic RAG models, Search-R1 and R1-Searcher, shows that the QPP estimates of generated queries are positively correlated with final answer quality, which suggests that QPP can serve as a proxy for retrieval utility. In addition, more effective retrievers improve answer quality *and* shorten the reasoning process. The core contributions are: (1) positioning QPP as a link between retrieval quality and answer quality in Agentic RAG; (2) offering a quantitative signal for judging whether retrieved results are likely to be useful; and (3) taking a step towards adaptive retrieval mechanisms informed by that signal.
📝 Abstract
Agentic Retrieval-Augmented Generation (RAG) is a new paradigm in which the reasoning model decides, while answering a question, when to invoke a retriever (as a "tool") to obtain external information. Recent work such as Search-R1 exemplifies this paradigm. However, the queries generated by such Agentic RAG models and the role of the retriever in obtaining high-quality answers remain understudied. To this end, this initial study examines the applicability of query performance prediction (QPP) within the recent Agentic RAG models Search-R1 and R1-Searcher. We find that applying effective retrievers can achieve higher answer quality within a shorter reasoning process. Moreover, the QPP estimates of the generated queries, used as an approximation of their retrieval quality, are positively correlated with the quality of the final answer. Ultimately, our work is a step towards adaptive retrieval within Agentic RAG, where QPP is used to inform the model whether the retrieved results are likely to be useful.
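To make the idea of correlating QPP estimates with answer quality concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not say which QPP predictors are used, so the sketch assumes a simple post-retrieval predictor based on the dispersion of the top-k retrieval scores (in the spirit of score-variance predictors, without any corpus-level normalisation). The function names `score_dispersion_qpp`, `pearson`, and `should_trust_retrieval`, and the `threshold` parameter, are hypothetical and chosen for illustration only.

```python
from math import sqrt
from statistics import pstdev


def score_dispersion_qpp(scores, k=10):
    """Assumed QPP predictor: standard deviation of the top-k retrieval scores.

    The heuristic (an assumption, not taken from the paper) is that a ranking
    whose top scores are well separated is more likely to be a good one.
    """
    top_k = sorted(scores, reverse=True)[:k]
    if len(top_k) < 2:
        return 0.0  # dispersion is not meaningful for fewer than two scores
    return pstdev(top_k)


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def should_trust_retrieval(scores, threshold, k=10):
    """Hypothetical adaptive-retrieval gate: keep results only if QPP >= threshold."""
    return score_dispersion_qpp(scores, k) >= threshold
```

In use, one would compute `score_dispersion_qpp` for each query the agent generates, pair each estimate with an answer-quality measure for the corresponding question, and pass the two lists to `pearson`. The gate in `should_trust_retrieval` only illustrates how such an estimate could inform the model; a suitable `threshold` would have to be tuned on held-out data.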