Counterfactual Query Rewriting to Use Historical Relevance Feedback

📅 2025-02-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the degradation of retrieval performance for repeated queries in dynamic corpora—caused by historically relevant documents becoming obsolete (e.g., through content modification or removal)—this paper proposes a counterfactual query rewriting method. Instead of relying on the continued availability of the original documents, it uses historical relevance feedback (e.g., clicks or annotations) as the signal source, reconstructing queries via lightweight keyquery generation and term expansion. This is the first approach to enable counterfactual reuse of historical feedback, thereby circumventing the risks associated with document obsolescence. It operates seamlessly with standard retrieval models (e.g., BM25) and requires no fine-tuning of large language models. Evaluated under the CLEF LongEval long-term evaluation framework, the method achieves a 4.2% improvement in nDCG@10 over baseline methods, outperforming computationally intensive Transformer-based approaches while substantially reducing inference overhead.
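The term-expansion half of the summary can be sketched as follows: pick the highest-weighted terms from the historically relevant documents and append them to the repeated query. This is a minimal stdlib illustration with an assumed TF-IDF weighting; the function names, tokenization, and weighting details are illustrative, not the paper's exact method.

```python
import math
from collections import Counter

def expand_query(query, relevant_docs, corpus, k=3):
    """Append the top-k TF-IDF terms from historical relevant docs to the query."""
    n = len(corpus)
    df = Counter()                        # document frequency over the current corpus
    for doc in corpus:
        df.update(set(doc.lower().split()))
    tf = Counter()                        # term frequency over the historical relevant docs
    for doc in relevant_docs:
        tf.update(doc.lower().split())
    scores = {
        t: tf[t] * math.log((n + 1) / (df[t] + 1))
        for t in tf if t not in query.lower().split()
    }
    expansion = [t for t, _ in sorted(scores.items(), key=lambda x: -x[1])[:k]]
    return query + " " + " ".join(expansion)

corpus = [
    "neural ranking models for web search",
    "bm25 term weighting in sparse retrieval",
    "click feedback improves ranking quality",
]
expanded = expand_query("web search", [corpus[0], corpus[2]], corpus)
print(expanded)
```

The expanded query string can then be issued to any standard retrieval model such as BM25, which is what makes the approach lightweight compared to Transformer-based rewriting.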

📝 Abstract
When a retrieval system receives a query it has encountered before, previous relevance feedback, such as clicks or explicit judgments, can help to improve retrieval results. However, the content of a previously relevant document may have changed, or the document may no longer be available. Despite this evolved corpus, we counterfactually use these previously relevant documents as relevance signals. In this paper, we propose approaches to rewrite user queries and compare them against a system that directly uses the previous qrels for ranking. We expand queries with terms extracted from the previously relevant documents or derive so-called keyqueries that rank the previously relevant documents at the top of the current corpus. Our evaluation in the CLEF LongEval scenario shows that rewriting queries with historical relevance feedback improves retrieval effectiveness and even outperforms computationally expensive transformer-based approaches.
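The keyquery idea from the abstract can be sketched as a greedy search: build a query from terms of the previously relevant documents until those documents rank at the top of the evolved corpus. The overlap-count scorer below stands in for a real retrieval model such as BM25, and all names and the greedy strategy are illustrative assumptions, not the paper's algorithm.

```python
from collections import Counter

def rank(query_terms, corpus):
    """Rank corpus documents by how many query terms they contain (ties by index)."""
    scores = [(sum(t in doc.split() for t in query_terms), i)
              for i, doc in enumerate(corpus)]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]

def keyquery(target_ids, corpus, max_terms=4):
    """Greedily add terms from the target docs until they rank above all others."""
    candidates = Counter()
    for i in target_ids:
        candidates.update(corpus[i].split())
    query = []
    for term, _ in candidates.most_common():
        trial = query + [term]
        ranking = rank(trial, corpus)
        # stop once every target document ranks above every non-target document
        if all(ranking.index(i) < len(target_ids) for i in target_ids):
            return trial
        query = trial
        if len(query) >= max_terms:
            break
    return query

corpus = [
    "sparse retrieval with bm25 weighting",
    "dense retrieval with neural encoders",
    "click logs as relevance feedback",
]
q = keyquery([0], corpus)   # find a query that ranks document 0 first
print(" ".join(q))
```

Because the keyquery is evaluated against the current corpus rather than the archived documents, it remains usable even after the originally relevant documents have changed or disappeared.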
Problem

Research questions and friction points this paper addresses.

Improving retrieval for repeated queries with historical relevance feedback
Rewriting queries using previously relevant documents
Matching transformer-based methods with lightweight query rewriting
Innovation

Methods, ideas, or system contributions that make the work stand out.

Counterfactual query rewriting technique
Historical relevance feedback utilization
Keyquery derivation to rank previously relevant documents