🤖 AI Summary
This work addresses the challenge that complex queries often admit multiple valid answers, which existing retrieval methods struggle to cover comprehensively. To this end, the authors propose the RVR framework, which employs an iterative closed-loop mechanism of “retrieve-verify-retrieve” to dynamically enrich the original query with verified documents, enabling effective query expansion and answer discovery without relying on complex agent-based systems. The approach requires only off-the-shelf or minimally fine-tuned retrievers and verifiers, yet achieves substantial gains in multi-answer recall. On the QAMPARI dataset, it yields at least a 10% relative (3% absolute) improvement in complete-recall performance, and it shows consistent gains on two out-of-domain datasets, QUEST and WebQuestionsSP, across different base retrievers.
📝 Abstract
Comprehensively retrieving diverse documents is crucial for addressing queries that admit a wide range of valid answers. We introduce retrieve-verify-retrieve (RVR), a multi-round retrieval framework designed to maximize answer coverage. In the first round, a retriever takes the original query and returns a candidate document set, and a verifier then identifies a high-quality subset. In subsequent rounds, the query is augmented with previously verified documents to uncover answers not covered in earlier rounds. RVR is effective even with off-the-shelf retrievers, and fine-tuning retrievers for our inference procedure brings further gains. Our method outperforms baselines, including agentic search approaches, achieving at least a 10% relative and 3% absolute gain in complete-recall percentage on a multi-answer retrieval dataset (QAMPARI). We also observe consistent gains on two out-of-domain datasets (QUEST and WebQuestionsSP) across different base retrievers. Our work presents a promising iterative approach to comprehensive answer recall by leveraging a verifier and adapting retrievers to a new inference scenario.
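The iterative loop described above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the `retrieve` (term-overlap scoring) and `verify` (query-term filtering) functions here are simplistic stand-ins for the dense retriever and learned verifier the abstract refers to, and all function names and the toy corpus are assumptions for illustration only.

```python
def retrieve(query_terms, corpus, k=2):
    """Toy retriever (assumption): rank documents by term overlap with the query."""
    scored = sorted(
        corpus,
        key=lambda doc: len(query_terms & set(doc.split())),
        reverse=True,
    )
    return scored[:k]

def verify(query_terms, docs):
    """Toy verifier (assumption): keep documents sharing at least one query term."""
    return [d for d in docs if query_terms & set(d.split())]

def rvr(query, corpus, rounds=3, k=2):
    """Retrieve-verify-retrieve: augment the query with verified documents
    each round so later rounds can surface answers the first round missed."""
    query_terms = set(query.split())
    verified = []
    for _ in range(rounds):
        candidates = retrieve(query_terms, corpus, k)
        # Only consider documents not already verified in earlier rounds.
        new_docs = verify(query_terms, [d for d in candidates if d not in verified])
        if not new_docs:
            break  # no new coverage gained; stop iterating
        verified.extend(new_docs)
        # Query enrichment: fold terms from newly verified documents into the query.
        for d in new_docs:
            query_terms |= set(d.split())
    return verified

corpus = [
    "python created by guido",
    "guido worked at dropbox",
    "dropbox is a company",
    "unrelated text about cats",
]
print(rvr("python creator", corpus))
```

In this toy run, the second document shares no terms with the original query and is rejected by the verifier in round one; only after the query is enriched with terms from the first verified document does it pass verification, which is the coverage-widening effect the framework aims for.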