🤖 AI Summary
Pointwise methods for LLM-based text ranking suffer from high bias, while pairwise approaches incur prohibitive computational overhead (O(n²)). To resolve this trade-off, this paper proposes RefRank, a zero-shot prompt-based reference-guided ranking framework. Its core innovation is the introduction of a fixed, semantically anchored reference document, enabling indirect pairwise comparisons between each candidate and the reference, thereby achieving linear time complexity (O(n)). Furthermore, RefRank incorporates a shared-reference mechanism and a multi-reference weighted aggregation strategy to enhance robustness and generalization. Extensive experiments across multiple benchmark datasets and diverse LLMs show that RefRank significantly outperforms pointwise baselines and achieves accuracy at least on par with pairwise methods, at a fraction of the inference cost.
📝 Abstract
Large Language Models (LLMs) have demonstrated exceptional performance in the task of text ranking for information retrieval. While Pointwise ranking approaches offer computational efficiency by scoring documents independently, they often yield biased relevance estimates due to the lack of inter-document comparisons. In contrast, Pairwise methods improve ranking accuracy by explicitly comparing document pairs, but suffer from substantial computational overhead with quadratic complexity ($O(n^2)$). To address this trade-off, we propose **RefRank**, a simple and effective comparative ranking method based on a fixed reference document. Instead of comparing all document pairs, RefRank prompts the LLM to evaluate each candidate relative to a shared reference anchor. By selecting a reference anchor that encapsulates the core query intent, RefRank implicitly captures relevance cues, enabling indirect comparison between documents via this common anchor. This reduces computational cost to linear time ($O(n)$) while, importantly, preserving the advantages of comparative evaluation. To further enhance robustness, we aggregate multiple RefRank outputs using a weighted averaging scheme across different reference choices. Experiments on several benchmark datasets and with various LLMs show that RefRank significantly outperforms Pointwise baselines and achieves performance at least on par with Pairwise approaches at significantly lower computational cost.
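The ranking procedure described above (score each candidate against a shared reference, then aggregate over several reference choices with weights) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `compare` is a hypothetical stand-in for the LLM prompt that scores a candidate relative to the reference, and the toy word-overlap scorer below merely makes the sketch runnable.

```python
from typing import Callable, Dict, List

def refrank(
    query: str,
    candidates: List[str],
    references: List[str],
    ref_weights: List[float],
    compare: Callable[[str, str, str], float],
) -> List[str]:
    """Rank candidates by a weighted average of reference-relative scores.

    compare(query, candidate, reference) stands in for an LLM call that
    returns a preference score for the candidate relative to the reference.
    Each candidate is scored once per reference, so the cost is O(n * k)
    for n candidates and k references -- linear in n, unlike the O(n^2)
    cost of exhaustive pairwise comparison.
    """
    scores: Dict[str, float] = {c: 0.0 for c in candidates}
    total_w = sum(ref_weights)
    for ref, w in zip(references, ref_weights):
        for cand in candidates:
            scores[cand] += w * compare(query, cand, ref)
    # Sort descending by the weight-normalized aggregated score.
    return sorted(candidates, key=lambda c: scores[c] / total_w, reverse=True)

# Toy demo: a mock "LLM" that scores by word overlap with query + reference.
def mock_compare(query: str, cand: str, ref: str) -> float:
    target = set(query.split()) | set(ref.split())
    return float(len(set(cand.split()) & target))

ranked = refrank(
    "capital of france",
    ["paris is the capital of france", "berlin is in germany", "france exports wine"],
    ["the capital city of france"],
    [1.0],
    mock_compare,
)
print(ranked[0])  # -> "paris is the capital of france"
```

With a single reference and unit weight, this reduces to the basic RefRank comparison; adding more (reference, weight) pairs implements the multi-reference aggregation the abstract describes.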