Pseudo-Relevance Feedback Can Improve Zero-Shot LLM-Based Dense Retrieval

📅 2025-03-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the longstanding trade-off between effectiveness and efficiency in zero-shot, LLM-based dense retrieval by introducing pseudo-relevance feedback (PRF) into this paradigm for the first time. The method uses a lightweight LLM to extract salient semantic features, such as keywords and abstractive summaries, from the top-ranked documents of an initial retrieval round, and injects them into the PromptReps query representation framework as a form of query expansion. Crucially, the approach requires no fine-tuning or additional annotations; performance gains come solely from PRF. Experiments on multiple passage retrieval benchmarks show substantial improvements in recall and mean reciprocal rank (MRR). Notably, in the re-ranking stage, a small re-ranker augmented with PRF matches or even surpasses larger, unenhanced baselines, striking a new balance between retrieval effectiveness and computational efficiency.
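The retrieve-extract-expand loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `embed` is a toy bag-of-words stand-in for PromptReps' LLM-derived representations, and `extract_features` is a hypothetical placeholder for the LLM that the paper prompts for keywords and summaries.

```python
from collections import Counter
from math import sqrt

def embed(text):
    # Toy bag-of-words "embedding"; PromptReps instead derives dense and
    # sparse representations from LLM hidden states. Stand-in only.
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a)
    denom = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / denom if denom else 0.0

def extract_features(doc):
    # Hypothetical stand-in for the paper's LLM feature extractor
    # (keywords / summaries); here we simply keep longer, rarer-looking terms.
    return " ".join(w for w in doc.lower().split() if len(w) > 6)

def prf_search(query, corpus, k=2):
    # 1) Initial retrieval: rank the corpus against the raw query.
    ranked = sorted(corpus, key=lambda d: cosine(embed(query), embed(d)), reverse=True)
    # 2) Pseudo-relevance feedback: extract features from the top-k documents.
    feedback = " ".join(extract_features(d) for d in ranked[:k])
    # 3) Query expansion: append the feedback text and re-retrieve.
    expanded = query + " " + feedback
    return sorted(corpus, key=lambda d: cosine(embed(expanded), embed(d)), reverse=True)
```

No model is fine-tuned in this loop, which mirrors the paper's claim that the gains come purely from the feedback signal added at query time.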

📝 Abstract
Pseudo-relevance feedback (PRF) refines queries by leveraging initially retrieved documents to improve retrieval effectiveness. In this paper, we investigate how large language models (LLMs) can facilitate PRF for zero-shot LLM-based dense retrieval, extending the recently proposed PromptReps method. Specifically, our approach uses LLMs to extract salient passage features, such as keywords and summaries, from top-ranked documents, which are then integrated into PromptReps to produce enhanced query representations. Experiments on passage retrieval benchmarks demonstrate that incorporating PRF significantly boosts retrieval performance. Notably, smaller rankers with PRF can match the effectiveness of larger rankers without PRF, highlighting PRF's potential to improve LLM-driven search while maintaining an efficient balance between effectiveness and resource usage.
Problem

Research questions and friction points this paper is trying to address.

Improves zero-shot LLM-based dense retrieval using pseudo-relevance feedback.
Leverages LLMs to extract features from top-ranked documents for query refinement.
Enhances retrieval performance while balancing effectiveness and resource efficiency.
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs extract features from top documents
PRF enhances query representations in PromptReps
Small rankers with PRF match larger rankers
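The second point, producing an "enhanced query representation" from feedback, is often modeled as Rocchio-style vector interpolation. The sketch below is an assumption for illustration: the paper feeds LLM-extracted text back into PromptReps rather than mixing vectors directly, but the interpolation shows why feedback shifts the query toward relevant documents.

```python
def fuse(query_vec, feedback_vecs, alpha=0.7):
    # Rocchio-style interpolation: keep weight alpha on the original query
    # vector and spread (1 - alpha) over the centroid of the feedback
    # document vectors. NOT the paper's exact PromptReps mechanism.
    n = len(feedback_vecs)
    centroid = [sum(v[i] for v in feedback_vecs) / n for i in range(len(query_vec))]
    return [alpha * q + (1 - alpha) * c for q, c in zip(query_vec, centroid)]
```

With `alpha = 1.0` the query is unchanged; lowering it pulls the representation toward the pseudo-relevant documents, which is the effect PRF exploits.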
Hang Li
The University of Queensland, Brisbane, Australia

Xiao Wang
University of International Business and Economics, Beijing, China

Bevan Koopman
CSIRO / University of Queensland
Information Retrieval · Semantic Search · Health Informatics

Guido Zuccon
Professor, University of Queensland; Google Research Australia
Information Retrieval