🤖 AI Summary
To address the limited modularity of LLM-based reranking in multi-stage retrieval, the unreliability of LLM APIs, and non-determinism in Mixture-of-Experts (MoE) models, this paper introduces RankLLM, an open-source Python toolkit designed specifically for reranking. Its core is a modular, highly configurable reranking framework that combines prompt analysis, response reliability diagnostics, and MoE behavior tracing, with optional Pyserini integration for retrieval. The toolkit provides a unified abstraction over diverse LLMs (10+ open- and closed-source) and integrated evaluation for multi-stage pipelines. The authors reproduce reported results from recent methods including RankGPT, LRL, RankVicuna, and RankZephyr, and compatibility with common inference frameworks makes such reproduction quick. Overall, the toolkit improves the configurability, robustness, and reproducibility of reranking systems.
📝 Abstract
The adoption of large language models (LLMs) as rerankers in multi-stage retrieval systems has gained significant traction in academia and industry. These models refine a candidate list of retrieved documents, often through carefully designed prompts, and are typically used in applications built on retrieval-augmented generation (RAG). This paper introduces RankLLM, an open-source Python package for reranking that is modular, highly configurable, and supports both proprietary and open-source LLMs in customized reranking workflows. To improve usability, RankLLM features optional integration with Pyserini for retrieval and provides integrated evaluation for multi-stage pipelines. Additionally, RankLLM includes a module for detailed analysis of input prompts and LLM responses, addressing reliability concerns with LLM APIs and non-deterministic behavior in Mixture-of-Experts (MoE) models. This paper presents the architecture of RankLLM, along with a detailed step-by-step guide and sample code. We reproduce results from RankGPT, LRL, RankVicuna, RankZephyr, and other recent models. RankLLM integrates with common inference frameworks and a wide range of LLMs. This compatibility allows for quick reproduction of reported results, helping to speed up both research and real-world applications. The complete repository is available at rankllm.ai, and the package can be installed via PyPI.
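To make the design concrete, here is a minimal sketch of the listwise-reranking pattern the abstract describes: a prompt is built over a window of candidates, any LLM backend is plugged in behind a single callable, and the model's ordering is parsed back into a permutation, with a repair step for malformed responses (the kind of reliability concern the paper's analysis module targets). All names here (`Candidate`, `rerank`, `stub_llm`, etc.) are illustrative inventions, not the actual RankLLM API.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration -- not the actual rank_llm API.
@dataclass
class Candidate:
    docid: str
    text: str

def build_listwise_prompt(query: str, candidates: List[Candidate]) -> str:
    """Number each candidate passage and ask the model for a ranking."""
    lines = [f"Rank the following passages by relevance to: {query}"]
    for i, c in enumerate(candidates, 1):
        lines.append(f"[{i}] {c.text}")
    lines.append("Answer with identifiers in order, e.g. [2] > [1] > [3].")
    return "\n".join(lines)

def parse_permutation(response: str, n: int) -> List[int]:
    """Extract a 0-based permutation from a '[2] > [1] > ...' response.
    Duplicates and out-of-range ids are dropped; omitted ids are appended
    in original order -- a simple repair for malformed LLM output."""
    seen: List[int] = []
    for tok in re.findall(r"\[(\d+)\]", response):
        idx = int(tok) - 1
        if 0 <= idx < n and idx not in seen:
            seen.append(idx)
    seen.extend(i for i in range(n) if i not in seen)
    return seen

def rerank(query: str, candidates: List[Candidate],
           generate: Callable[[str], str]) -> List[Candidate]:
    """Swap in any LLM backend (API or local) via the `generate` callable."""
    prompt = build_listwise_prompt(query, candidates)
    order = parse_permutation(generate(prompt), len(candidates))
    return [candidates[i] for i in order]

# A stub "LLM" standing in for a real backend; it omits [2] on purpose.
def stub_llm(prompt: str) -> str:
    return "[3] > [1]"

docs = [Candidate("d1", "about cats"), Candidate("d2", "about dogs"),
        Candidate("d3", "cats and their habits")]
ranked = rerank("cat habits", docs, stub_llm)
print([c.docid for c in ranked])  # → ['d3', 'd1', 'd2']
```

The `generate` callable is the key abstraction: the same prompt-build/parse logic works whether the backend is a proprietary API or a local open-source model, which mirrors the paper's goal of supporting both in one workflow.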