RAQG-QPP: Query Performance Prediction with Retrieved Query Variants and Retrieval Augmented Query Generation

📅 2026-04-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses two limitations: existing unsupervised query performance prediction (QPP) methods are less effective on neural rankers than on lexical retrieval models, and variant-based QPP methods that build query variants by term expansion tend to produce incoherent or off-topic variants. The authors propose retrieving query variants from a log of past queries and, in addition, using large language models (LLMs) to generate further variants conditioned on the retrieved ones, so that the approach is not limited to queries already present in the log. Experiments on the TREC Deep Learning Track 2019 and 2020 datasets show improvements of up to 30% over the best existing variant-based prediction method on neural ranking models such as MonoT5.

📝 Abstract

Query Performance Prediction (QPP) estimates the retrieval quality of ranking models without the use of any human-assessed relevance judgements, and finds applications in query-specific selective decision making to improve overall retrieval effectiveness. Although unsupervised QPP approaches are effective for lexical retrieval models, they are usually less effective for neural rankers. Recent work shows that leveraging query variants (QVs), i.e., queries with potentially similar information needs to a given query, can enhance unsupervised QPP accuracy. However, existing QV-based prediction methods rely on query variants generated by term expansion of the input query, which is likely to yield incoherent, hallucinatory and off-topic QVs. In this paper, we propose to make use of queries retrieved from a log of past queries as QVs to be subsequently used for QPP. In addition to directly applying retrieved QVs in QPP, we further propose to leverage large language models (LLMs) to generate QVs conditioned on the retrieved QVs, a step we refer to as retrieval augmented query generation (RAQG), thus mitigating the limitation of relying only on existing queries in a log. Experiments on TREC DL'19 and DL'20 show that QPP enhanced with RAQG outperforms the best-performing existing QV-based prediction approach by as much as 30% on neural ranking models such as MonoT5.
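
The pipeline described in the abstract can be pictured with a small Python sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the Jaccard token overlap used to retrieve variants from the log, the similarity-weighted interpolation with weight `lam`, the prompt wording, and all function names are hypothetical and are not taken from the paper. `base_predictor` stands in for any unsupervised QPP estimator.

```python
from typing import Callable, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two queries (assumed similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve_variants(query: str, query_log: List[str], k: int = 3) -> List[Tuple[str, float]]:
    """Return up to k past queries most similar to `query`, with their similarities."""
    scored = [(q, jaccard(query, q)) for q in query_log if q != query]
    scored = [(q, s) for q, s in scored if s > 0.0]  # drop unrelated log entries
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_generation_prompt(query: str, variants: List[Tuple[str, float]]) -> str:
    """Hypothetical prompt that conditions an LLM on the retrieved variants."""
    examples = "\n".join(f"- {q}" for q, _ in variants)
    return (
        f"Original query: {query}\n"
        f"Related past queries:\n{examples}\n"
        "Write new queries that express the same information need."
    )


def qv_smoothed_qpp(
    query: str,
    variants: List[Tuple[str, float]],
    base_predictor: Callable[[str], float],
    lam: float = 0.5,
) -> float:
    """Interpolate the query's own QPP estimate with a similarity-weighted
    average of its variants' estimates (an assumed combination rule)."""
    own = base_predictor(query)
    total = sum(s for _, s in variants)
    if total == 0.0:
        return own  # no usable variants: fall back to the base estimate
    variant_estimate = sum(s * base_predictor(q) for q, s in variants) / total
    return lam * own + (1.0 - lam) * variant_estimate


# Toy usage: the base predictor here is a placeholder, not a real QPP method.
log = ["cheap flights to paris", "flights paris cheap tickets", "python list sort"]
variants = retrieve_variants("cheap paris flights", log, k=2)
toy_predictor = lambda q: len(q.split()) / 10.0
score = qv_smoothed_qpp("cheap paris flights", variants, toy_predictor)
prompt = build_generation_prompt("cheap paris flights", variants)
```

In a fuller setup, the text returned by `build_generation_prompt` would be sent to an LLM and the generated queries appended to `variants` before calling `qv_smoothed_qpp`; how the generated variants are weighted is left open here because the abstract does not specify it.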

Problem

Research questions and friction points this paper is trying to address.

Query Performance Prediction

Query Variants

Neural Rankers

Unsupervised QPP

Retrieval Effectiveness

Innovation

Methods, ideas, or system contributions that make the work stand out.

Query Performance Prediction

Retrieved Query Variants

Retrieval-Augmented Generation