🤖 AI Summary
This work addresses two limitations of current large language model (LLM)-based rerankers: high computational overhead and context-length constraints. Conventional truncation strategies rely on static heuristics that do not adapt to the query. The authors propose using an LLM to generate a semantic reference document that serves as a dynamic boundary between relevant and non-relevant documents and guides ranked list truncation. This is combined with either parallel non-overlapping windows or overlapping windows with adaptive strides to enable efficient listwise reranking. Using LLM-generated reference documents for truncation avoids the fixed hyperparameters and topic-agnostic heuristics of earlier setups. On the TREC Deep Learning benchmarks, the approach outperforms existing truncation-based approaches, and on in-domain and out-of-domain benchmarks it accelerates LLM-based listwise reranking by up to 66% compared to existing approaches.
📝 Abstract
Large Language Models (LLMs) are widely used for reranking, but computational overhead and long context lengths remain challenging for LLM rerankers. Efficient reranking usually involves selecting a subset of the first-stage ranked list, a step known as ranked list truncation (RLT). The truncated list is then processed by a reranker. For LLM rerankers, the ranked list is often partitioned and processed sequentially in batches to reduce the context length. Both of these steps rely on hyperparameters and topic-agnostic heuristics. Recently, LLMs have been shown to be effective for relevance judgment. Analogously, we propose that LLMs can be used to generate reference documents that act as a pivot between relevant and non-relevant documents in a ranked list. We propose methods that use these generated reference documents for RLT as well as for efficient listwise reranking. During reranking, we process the ranked list either in parallel batches of non-overlapping windows or in overlapping windows with adaptive strides, improving on the existing fixed-stride setup. The generated reference documents are also shown to improve existing efficient listwise reranking frameworks. Experiments on the TREC Deep Learning benchmarks show that our approach outperforms existing RLT-based approaches. In-domain and out-of-domain benchmarks demonstrate that our proposed methods accelerate LLM-based listwise reranking by up to 66% compared to existing approaches. This work not only establishes a practical paradigm for efficient LLM-based reranking but also provides insight into the capability of LLMs to generate semantically controlled documents using relevance signals.
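To make the pivot idea concrete, here is a minimal Python sketch of one way a reference document could drive truncation and non-overlapping windowing. The abstract does not say how the reference document is compared with the ranked list, so this sketch assumes the first-stage scorer assigns the reference document a score on the same scale as the retrieved documents. The function names `truncate_at_reference` and `non_overlapping_windows`, and all scores, are hypothetical illustrations rather than the authors' implementation.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple

ScoredDoc = Tuple[str, float]  # (document id, first-stage score)


def truncate_at_reference(
    ranked: Sequence[ScoredDoc], reference_score: float
) -> List[ScoredDoc]:
    """Keep only documents scoring strictly above the reference document.

    Assumption: `ranked` is sorted by descending score, and
    `reference_score` comes from the same scorer as the ranked list.
    """
    # bisect works on ascending sequences, so search the negated scores.
    negated = [-score for _, score in ranked]
    cut = bisect_left(negated, -reference_score)
    return list(ranked[:cut])


def non_overlapping_windows(
    docs: Sequence[str], window_size: int
) -> List[List[str]]:
    """Split a list into consecutive, disjoint windows.

    Each window could be sent to a listwise reranker independently,
    which is what makes parallel batching possible.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return [
        list(docs[i : i + window_size])
        for i in range(0, len(docs), window_size)
    ]


if __name__ == "__main__":
    # Hypothetical first-stage ranking and reference score.
    ranked = [("d1", 0.9), ("d2", 0.7), ("d3", 0.4), ("d4", 0.2)]
    kept = truncate_at_reference(ranked, reference_score=0.5)
    print(kept)
    print(non_overlapping_windows([doc_id for doc_id, _ in kept], 1))
```

In this sketch the cut-off moves with each query's reference score rather than being a fixed depth. The adaptive-stride overlapping variant is not sketched here because the abstract gives no detail on how strides are chosen.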