Can QPP Choose the Right Query Variant? Evaluating Query Variant Selection for RAG Pipelines

📅 2026-04-24
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work investigates how to select, at low cost, the best query variant among multiple semantically equivalent reformulations of the same information need in a retrieval-augmented generation (RAG) pipeline, with the goal of improving end-to-end generation quality. It applies query performance prediction (QPP) to intra-topic query variant selection and systematically compares pre-retrieval and post-retrieval QPP approaches. Large-scale evaluations on TREC-RAG with both sparse and dense retrievers reveal a “utility gap” between retrieval metrics and downstream generation quality: the variant with the best ranking score is often not the one that yields the best generated answer. Even so, QPP can identify variants that outperform the original query, and lightweight pre-retrieval predictors frequently match or surpass more expensive post-retrieval methods, which makes them a latency-efficient option.
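
As a rough illustration of what pre-retrieval variant selection can look like, the sketch below scores each variant with average IDF, a classic pre-retrieval predictor, and keeps the highest-scoring one. The choice of predictor, the function names (`build_doc_freq`, `avg_idf`, `select_variant`), and the toy corpus are assumptions made for this example; the summary does not say which predictors the paper evaluates.

```python
import math
from collections import Counter


def build_doc_freq(corpus):
    """Count, for each term, how many documents contain it."""
    df = Counter()
    for doc in corpus:
        df.update(set(doc.lower().split()))
    return df


def avg_idf(query, df, n_docs):
    """Hypothetical pre-retrieval predictor: mean smoothed IDF of the query terms.

    It needs only collection statistics, so no retrieval run is required.
    """
    terms = query.lower().split()
    if not terms:
        return 0.0
    total = sum(math.log((n_docs + 1) / (df.get(t, 0) + 1)) for t in terms)
    return total / len(terms)


def select_variant(variants, df, n_docs):
    """Return the variant with the highest predicted performance."""
    return max(variants, key=lambda q: avg_idf(q, df, n_docs))


# Toy corpus and variants, invented purely for illustration.
corpus = [
    "solar panels convert sunlight into electricity",
    "wind turbines generate electricity from wind",
    "the history of the electric grid",
]
df = build_doc_freq(corpus)
variants = [
    "electricity from wind",
    "wind turbines generate electricity",
]
best = select_variant(variants, df, len(corpus))
print(best)
```

Only the selected variant would then be sent through retrieval and generation, which is where the latency saving comes from.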

📝 Abstract
Large Language Models (LLMs) have made query reformulation ubiquitous in modern retrieval and Retrieval-Augmented Generation (RAG) pipelines, enabling the generation of multiple semantically equivalent query variants. However, executing the full pipeline for every reformulation is computationally expensive, motivating selective execution: can we identify the best query variant before incurring downstream retrieval and generation costs? We investigate Query Performance Prediction (QPP) as a mechanism for variant selection across ad-hoc retrieval and end-to-end RAG. Unlike traditional QPP, which estimates query difficulty across topics, we study intra-topic discrimination: selecting the optimal reformulation among competing variants of the same information need. Through large-scale experiments on TREC-RAG using both sparse and dense retrievers, we evaluate pre- and post-retrieval predictors under correlation- and decision-based metrics. Our results reveal a systematic divergence between retrieval and generation objectives: variants that maximize ranking metrics such as nDCG often fail to produce the best generated answers, exposing a "utility gap" between retrieval relevance and generation fidelity. Nevertheless, QPP can reliably identify variants that improve end-to-end quality over the original query. Notably, lightweight pre-retrieval predictors frequently match or outperform more expensive post-retrieval methods, offering a latency-efficient approach to robust RAG.
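
To make the difference between correlation-based and decision-based evaluation concrete, here is a minimal sketch for a single topic. It assumes Kendall's tau as the correlation measure and "gain of the selected variant over the original query" as the decision measure; the abstract does not name the exact metrics, so both are illustrative choices, and all scores below are invented.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a between two equally long score lists (ties count as neither)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs if n_pairs else 0.0


def selection_gain(predicted, actual, original_index=0):
    """Decision-based view: actual score of the top-predicted variant
    minus the actual score of the original query."""
    chosen = max(range(len(predicted)), key=predicted.__getitem__)
    return actual[chosen] - actual[original_index]


# Hypothetical scores for one topic; index 0 is the original query.
predicted = [0.2, 0.9, 0.5]          # QPP scores per variant
ndcg = [0.40, 0.55, 0.60]            # retrieval quality per variant
answer_quality = [0.50, 0.45, 0.70]  # generation quality per variant

tau_retrieval = kendall_tau(predicted, ndcg)
gain_retrieval = selection_gain(predicted, ndcg)
gain_generation = selection_gain(predicted, answer_quality)
print(tau_retrieval, gain_retrieval, gain_generation)
```

With these made-up numbers the selected variant improves nDCG over the original query but lowers answer quality, which is the kind of divergence the abstract calls the utility gap.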

Problem

Research questions and friction points this paper is trying to address.

Query Performance Prediction
Query Reformulation
Retrieval-Augmented Generation
Query Variant Selection
RAG

Innovation

Methods, ideas, or system contributions that make the work stand out.

Query Performance Prediction
Query Reformulation
Retrieval-Augmented Generation
Intra-topic Discrimination
Pre-retrieval Prediction