🤖 AI Summary
Pairwise Ranking Prompting (PRP) achieves strong zero-shot document ranking performance but suffers from quadratic computational complexity—O(n²) in the number of documents—due to exhaustive pairwise enumeration, hindering practical deployment. To address this, we propose a pairwise distillation framework that transfers ranking knowledge from the pairwise comparisons produced by a large language model (LLM) into a lightweight pointwise ranking model, enabling an efficient pairwise-to-pointwise conversion. Our key finding is that the distillation is sample-efficient: by sampling only 2% of all possible document pairs, the distilled model attains ranking performance comparable to distilling from the full set of pairs. This drastically reduces both distillation and inference overhead while preserving zero-shot generalization. Empirical results demonstrate that our approach enables accurate, scalable ranking in real-world applications without task-specific fine-tuning or labeled data.
📝 Abstract
While Pairwise Ranking Prompting (PRP) with Large Language Models (LLMs) is one of the most effective zero-shot document ranking methods, it has quadratic computational complexity with respect to the number of documents to be ranked, as it requires enumerating all possible document pairs. Consequently, the outstanding ranking performance of PRP has remained out of reach for most real-world ranking applications.
In this work, we propose to harness the effectiveness of PRP through pairwise distillation. Specifically, we distill a pointwise student ranker from pairwise teacher labels generated by PRP, resulting in an efficient student model that retains the performance of PRP at substantially lower computational cost. Furthermore, we find that the distillation process can be made sample-efficient: with only 2% of pairs, we obtain the same performance as using all pairs for teacher labels. Our approach thus harnesses the ranking performance of PRP without incurring high computational costs during either distillation or serving.
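To make the pairwise-to-pointwise idea concrete, here is a minimal sketch of the two ingredients the abstract describes: sampling a small fraction of document pairs, and fitting a pointwise score per document so that it agrees with the teacher's pairwise preferences. This is an illustrative toy, not the paper's implementation: the function names, the RankNet-style logistic loss, and the gradient-descent fit are our assumptions, and the teacher here is a stand-in callable rather than an actual PRP-prompted LLM.

```python
import itertools
import random
import numpy as np

def sample_pairs(n_docs, frac=0.02, seed=0):
    """Sample a fraction of all ordered document pairs (i, j), i != j.

    The paper's observation is that a small fraction (e.g. frac=0.02)
    of teacher-labeled pairs can suffice at realistic scales.
    """
    all_pairs = list(itertools.permutations(range(n_docs), 2))
    k = max(1, int(frac * len(all_pairs)))
    return random.Random(seed).sample(all_pairs, k)

def distill_pointwise(pairs, teacher_prefers, n_docs, lr=0.1, epochs=200):
    """Fit one pointwise score per document from pairwise teacher labels.

    Uses a RankNet-style logistic loss (our choice, not necessarily the
    paper's): push score[i] above score[j] whenever the teacher prefers
    document i over document j. In the real setting the scores would come
    from a parametric student ranker conditioned on (query, document).
    """
    scores = np.zeros(n_docs)
    for _ in range(epochs):
        for i, j in pairs:
            y = 1.0 if teacher_prefers(i, j) else 0.0
            p = 1.0 / (1.0 + np.exp(-(scores[i] - scores[j])))  # P(i beats j)
            grad = p - y
            scores[i] -= lr * grad
            scores[j] += lr * grad
    return scores
```

At serving time, only the distilled pointwise scores are needed, so ranking n documents costs n scoring calls instead of the O(n²) pairwise LLM comparisons; the teacher is queried only for the sampled pairs during distillation.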