When More Reformulations Hurt: Avoiding Drift using Ranker Feedback

📅 2026-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a tension in retrieval systems: generating many query reformulations can improve recall, but it often induces query drift and raises reranking costs, which makes reformulations hard to use efficiently under a limited inference budget. The authors propose ReformIR, a framework that treats query reformulations as first-class features and jointly optimizes reformulation selection and document filtering within a fixed reranking budget. Using a strong neural reranker as a teacher that provides online relevance estimates, ReformIR trains a lightweight surrogate model to adaptively select high-value reformulations and documents, mitigating drift while increasing recall. Experiments on the MS MARCO passage corpora and the TREC Deep Learning benchmarks (DL19-DL22) show that ReformIR consistently outperforms existing reformulation strategies and keeps its gains as the number of reformulations increases, supporting the case for feedback-driven reformulation optimization.

📝 Abstract

Modern retrieval pipelines increasingly rely on query reformulation and neural reranking to improve effectiveness, but this comes at a significant computational cost and introduces a fundamental tradeoff between recall and query drift. Generating many reformulated queries can substantially increase recall, yet naively merging or exhaustively reranking their results is prohibitively expensive. In this work, we argue that the core challenge is not reformulation generation itself, but the adaptive selection of reformulations and their retrieved documents under a strict inference budget. We propose ReformIR, a budget-aware retrieval framework that treats query reformulations as first-class features and performs online relevance estimation using a strong reranker as a teacher. Given multiple reformulated queries, ReformIR constructs a large candidate pool and learns a lightweight surrogate model that estimates document utility from reformulation-specific retrieval signals. Under a fixed reranking budget, the surrogate adaptively prioritizes both reformulations and documents, selectively querying a teacher reranker anchored to the original query. This process increases recall while actively suppressing drift through online feature selection over reformulations. We conduct extensive experiments on the MS MARCO passage corpora and the TREC Deep Learning benchmarks (DL19-DL22). Our results show that ReformIR consistently outperforms existing reformulation strategies, particularly as the number of reformulations increases, where prior methods suffer from severe quality degradation due to drift. Our findings also suggest a shift in retrieval system design: rather than using large language models as rerankers, their capacity is more effectively leveraged in the reformulation stage with feedback-driven optimization.
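
The budgeted selection loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the surrogate's form, so the linear model updated by stochastic gradient steps is an assumption, and the names `budgeted_rerank`, `pool`, and `teacher` are hypothetical. Each document is represented by one retrieval score per reformulation (0.0 where a reformulation did not retrieve it), and `teacher` stands in for an expensive reranker scoring a document against the original query.

```python
from typing import Callable, Dict, List, Tuple


def budgeted_rerank(
    pool: Dict[str, List[float]],      # doc id -> one retrieval score per reformulation
    teacher: Callable[[str], float],   # expensive reranker, anchored to the original query
    budget: int,                       # maximum number of teacher calls
    batch_size: int = 1,
    lr: float = 0.5,
) -> Tuple[List[Tuple[str, float]], List[float]]:
    """Spend a fixed teacher budget on the documents a linear surrogate rates highest."""
    if not pool or budget <= 0:
        return [], []

    n_feats = len(next(iter(pool.values())))
    weights = [1.0] * n_feats          # assumption: trust every reformulation equally at first
    scored: Dict[str, float] = {}

    def surrogate(doc: str) -> float:
        return sum(w * x for w, x in zip(weights, pool[doc]))

    limit = min(budget, len(pool))
    while len(scored) < limit:
        # Rank the not-yet-scored documents by current surrogate utility.
        remaining = sorted(
            (d for d in pool if d not in scored), key=surrogate, reverse=True
        )
        take = min(batch_size, limit - len(scored))
        for doc in remaining[:take]:
            label = teacher(doc)
            err = label - surrogate(doc)
            scored[doc] = label
            # Online update: reformulations whose documents the teacher rejects
            # lose weight, so later picks drift less.
            weights = [w + lr * err * x for w, x in zip(weights, pool[doc])]

    ranking = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    return ranking, weights
```

In this sketch, the weight learned for each reformulation acts as a rough drift signal: a reformulation that keeps surfacing documents the teacher scores low is down-weighted, and its remaining candidates fall behind those of on-topic reformulations before the budget runs out.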

Problem

Research questions and friction points this paper is trying to address.

query reformulation

query drift

retrieval

reranking

inference budget

Innovation

Methods, ideas, or system contributions that make the work stand out.

query reformulation

query drift

budget-aware retrieval