Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-Ranking

📅 2024-05-13
📈 Citations: 5
Influential: 1
🤖 AI Summary
Lightweight cross-encoders underperform their large language model (LLM) teachers in passage re-ranking, limiting their deployment despite clear computational advantages. Method: The paper introduces Rank-DistiLLM, a distillation dataset and training setup that brings the fine-tuning methods best suited to cross-encoders, namely hard-negative sampling, deep sampling, and listwise loss functions, into LLM-to-cross-encoder distillation. The ranking signals are generated by LLMs, replacing costly human annotations. Contribution/Results: Cross-encoders trained on Rank-DistiLLM keep their lightweight architecture while substantially improving ranking quality. Experiments on standard benchmarks show that the distilled models match the re-ranking effectiveness of their LLM teachers while being up to 173× faster at inference and 24× more memory efficient, effectively bridging the efficiency–effectiveness gap in practical retrieval systems.
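The listwise distillation objective mentioned above can be illustrated with a minimal sketch. This is not the paper's exact loss, just a hypothetical ListNet-style cross-entropy between the teacher's and student's score distributions over one ranked list of passages; all scores below are made-up toy values.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of relevance scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def listwise_distill_loss(teacher_scores, student_scores):
    """ListNet-style listwise loss: cross-entropy between the teacher's
    and the student's softmax distributions over one candidate list."""
    p_teacher = softmax(teacher_scores)
    p_student = softmax(student_scores)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))

# Toy query: one positive, one hard negative, and two deeper-sampled
# negatives (hypothetical scores, not from the paper).
teacher = [4.0, 2.5, 0.5, -1.0]
student_close = [3.8, 2.2, 0.6, -0.9]   # mimics the teacher's ordering
student_flat = [0.0, 0.0, 0.0, 0.0]     # ignores the teacher entirely

# A student that reproduces the teacher's ranking incurs a lower loss.
assert listwise_distill_loss(teacher, student_close) < \
       listwise_distill_loss(teacher, student_flat)
```

The listwise formulation scores the whole candidate list jointly, which is what lets the student absorb the teacher's relative preferences among hard and deep negatives rather than learning each passage in isolation.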

📝 Abstract
Cross-encoders distilled from large language models (LLMs) are often more effective re-rankers than cross-encoders fine-tuned on manually labeled data. However, distilled models do not match the effectiveness of their teacher LLMs. We hypothesize that this effectiveness gap is due to the fact that previous work has not applied the best-suited methods for fine-tuning cross-encoders on manually labeled data (e.g., hard-negative sampling, deep sampling, and listwise loss functions). To close this gap, we create a new dataset, Rank-DistiLLM. Cross-encoders trained on Rank-DistiLLM achieve the effectiveness of LLMs while being up to 173 times faster and 24 times more memory efficient. Our code and data are available at https://github.com/webis-de/ECIR-25.
Problem

Research questions and friction points this paper is trying to address.

Bridging the effectiveness gap between cross-encoders and LLMs
Improving passage re-ranking with better-suited fine-tuning methods
Enhancing the speed and memory efficiency of re-ranking models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distilled cross-encoders outperform cross-encoders fine-tuned on manually labeled data
Rank-DistiLLM dataset bridges LLM effectiveness gap
173x faster and 24x more memory efficient