🤖 AI Summary
Text embedding models are efficient, but their ranking accuracy falls short of computationally intensive LLM-based rerankers. To address this, we propose E²Rank, a framework that enables a single embedding model to support both retrieval and listwise reranking. E²Rank achieves this by continuing to train a text embedding model under a listwise ranking objective, with cosine similarity as the unified scoring function. The listwise ranking prompt, built from the query and its candidate documents, acts as a candidate-aware augmented query, similar to pseudo-relevance feedback, and carries signals from both query-document and document-document interactions. E²Rank preserves the low latency of the base embedding model while substantially improving reranking quality. Experiments show that E²Rank achieves state-of-the-art results on the BEIR reranking benchmark, performs competitively on the reasoning-intensive BRIGHT benchmark, and improves embedding performance on the MTEB benchmark, which supports its use for both retrieval and reranking.
📝 Abstract
Text embedding models serve as a fundamental component in real-world search applications. By mapping queries and documents into a shared embedding space, they deliver competitive retrieval performance with high efficiency. However, their ranking fidelity remains limited compared to dedicated rerankers, especially recent LLM-based listwise rerankers, which capture fine-grained query-document and document-document interactions. In this paper, we propose a simple yet effective unified framework, $\text{E}^2\text{Rank}$, short for Efficient Embedding-based Ranking (and also Embedding-to-Rank), which extends a single text embedding model to perform both high-quality retrieval and listwise reranking through continued training under a listwise ranking objective, thereby achieving strong effectiveness with remarkable efficiency. $\text{E}^2\text{Rank}$ applies cosine similarity between the query and document embeddings as a unified ranking function. Under this function, the listwise ranking prompt, which is constructed from the original query and its candidate documents, serves as an enhanced query enriched with signals from the top-K documents, akin to pseudo-relevance feedback (PRF) in traditional retrieval models. This design preserves the efficiency and representational quality of the base embedding model while significantly improving its reranking performance. Empirically, $\text{E}^2\text{Rank}$ achieves state-of-the-art results on the BEIR reranking benchmark and demonstrates competitive performance on the reasoning-intensive BRIGHT benchmark, with very low reranking latency. We also show that the ranking training process improves embedding performance on the MTEB benchmark. Our findings indicate that a single embedding model can effectively unify retrieval and reranking, offering both computational efficiency and competitive ranking accuracy.
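The scoring scheme described in the abstract can be illustrated with a minimal sketch: embed a listwise prompt built from the query and its top-K candidates, then rank each candidate by cosine similarity to that prompt embedding. Everything named here is an assumption made for illustration and does not come from the paper: `toy_embed` is a hashed bag-of-words stand-in for a trained embedding model, and the prompt template in `build_listwise_prompt` is hypothetical.

```python
import zlib
import numpy as np

DIM = 64  # toy embedding size (assumption, not from the paper)


def toy_embed(text: str) -> np.ndarray:
    """Hypothetical stand-in for a trained embedding model: hashed bag of words."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    return vec


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def build_listwise_prompt(query: str, candidates: list[str], top_k: int) -> str:
    """Hypothetical template: the top-K candidates followed by the query."""
    lines = [f"[{i + 1}] {doc}" for i, doc in enumerate(candidates[:top_k])]
    return "Documents:\n" + "\n".join(lines) + f"\nQuery: {query}"


def rerank(query, candidates, embed=toy_embed, top_k=20):
    """Score every candidate against the embedding of the listwise prompt."""
    # The prompt embedding plays the role of the enhanced (PRF-like) query.
    prompt_vec = embed(build_listwise_prompt(query, candidates, top_k))
    scores = [cosine(prompt_vec, embed(doc)) for doc in candidates]
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [(candidates[i], scores[i]) for i in order]


if __name__ == "__main__":
    docs = [
        "dense retrieval with text embeddings",
        "cooking pasta at home",
        "listwise reranking with large language models",
    ]
    for doc, score in rerank("embedding based reranking", docs, top_k=2):
        print(f"{score:.3f}  {doc}")
```

In this sketch, reranking costs one additional forward pass for the prompt plus a few dot products. Document embeddings are recomputed here for simplicity; a real system could presumably reuse the vectors already computed at retrieval time.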