LANCER: LLM Reranking for Nugget Coverage

📅 2026-01-29
🤖 AI Summary
Existing retrieval methods are optimized for relevance ranking and often fail to provide the breadth of information that long-form text generation requires. To address this, the authors propose a coverage-oriented reranking approach that uses large language models (LLMs) to generate sub-questions reflecting the key information needs behind a query. The method then identifies which sub-questions each retrieved document answers and reorders the documents to maximize overall coverage, combining sub-question generation, document-to-sub-question matching, and coverage-aware reranking. Evaluated with nugget coverage metrics such as α-nDCG, the approach achieves higher information coverage than other LLM-based reranking methods, and an oracle analysis indicates that sub-question generation plays an essential role.
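
For readers unfamiliar with the metric named in the summary, the block below sketches how α-nDCG is commonly defined in the diversity-evaluation literature. It is not taken from this paper: the symbols $J$, $m$, $r_{i,k-1}$ and the cutoff $k$ are notation assumed here for illustration, and the paper's exact evaluation settings (for example its value of $\alpha$) are not stated on this page.

```latex
% J(d_k, i) = 1 if the document at rank k contains nugget i, else 0 (assumed notation)
% m         = number of nuggets for the query
% r_{i,k-1} = number of documents ranked above k that already contain nugget i
\begin{align}
r_{i,k-1} &= \sum_{j=1}^{k-1} J(d_j, i) \\
G[k] &= \sum_{i=1}^{m} J(d_k, i)\,(1-\alpha)^{r_{i,k-1}} \\
\mathrm{DCG}[k] &= \sum_{j=1}^{k} \frac{G[j]}{\log_2(1+j)} \\
\alpha\text{-nDCG}[k] &= \frac{\mathrm{DCG}[k]}{\mathrm{DCG}'[k]}
\end{align}
```

Here $\mathrm{DCG}'[k]$ is the value for an ideal ordering, which is usually approximated greedily. With $\alpha = 0$ repeated nuggets are not penalized; as $\alpha$ approaches 1, a nugget that has already appeared contributes almost nothing, so rankings that cover new nuggets early score higher.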

📝 Abstract
Unlike short-form retrieval-augmented generation (RAG), such as factoid question answering, long-form RAG requires retrieval to provide documents covering a wide range of relevant information. Automated report generation exemplifies this setting: it requires not only relevant information but also a more elaborate response with comprehensive information. Yet, existing retrieval methods are primarily optimized for relevance ranking rather than information coverage. To address this limitation, we propose LANCER, an LLM-based reranking method for nugget coverage. LANCER predicts what sub-questions should be answered to satisfy an information need, predicts which documents answer these sub-questions, and reranks documents in order to provide a ranked list covering as many information nuggets as possible. Our empirical results show that LANCER enhances the quality of retrieval as measured by nugget coverage metrics, achieving higher $\alpha$-nDCG and information coverage than other LLM-based reranking methods. Our oracle analysis further reveals that sub-question generation plays an essential role.
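
The final step described in the abstract, reranking so that the list covers as many nuggets as possible, can be illustrated with a small greedy sketch. This is not the authors' implementation: the function name `greedy_coverage_rerank`, the `scores` layout, and the `alpha` discount are all assumptions made for illustration, and the sub-question scores that an LLM would produce in LANCER are hard-coded here.

```python
from typing import Dict, List


def greedy_coverage_rerank(
    scores: Dict[str, List[float]], alpha: float = 0.5
) -> List[str]:
    """Greedily order documents by marginal sub-question coverage.

    scores[doc_id][j] is a hypothetical value in [0, 1] saying how well
    the document answers sub-question j (assumed to come from an LLM judge).
    alpha controls how strongly already-answered sub-questions are discounted.
    """
    if not scores:
        return []

    n_sub = len(next(iter(scores.values())))
    # weight[j] shrinks each time a selected document answers sub-question j.
    weight = [1.0] * n_sub
    remaining = dict(scores)
    ranking: List[str] = []

    while remaining:
        # Marginal gain: answered sub-questions, discounted by prior coverage.
        def gain(doc_id: str) -> float:
            return sum(s * w for s, w in zip(remaining[doc_id], weight))

        # Ties are broken by insertion order of the input dictionary.
        best = max(remaining, key=gain)
        ranking.append(best)
        for j, s in enumerate(remaining.pop(best)):
            weight[j] *= 1.0 - alpha * s

    return ranking


# Toy input: three documents, three sub-questions (made-up scores).
example_scores = {
    "d1": [1.0, 1.0, 0.0],  # answers sub-questions 0 and 1
    "d2": [1.0, 0.0, 0.0],  # answers only sub-question 0, redundant after d1
    "d3": [0.0, 0.0, 1.0],  # answers only sub-question 2, new information
}
example_ranking = greedy_coverage_rerank(example_scores, alpha=0.5)
```

In this toy input, `d2` and `d3` each answer one sub-question, but once `d1` is selected the sub-question answered by `d2` is already covered, so the sketch places `d3` ahead of `d2`. A purely relevance-based ordering has no reason to prefer one over the other, which is the gap between relevance ranking and coverage that the abstract points to.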

Problem

Research questions and friction points this paper is trying to address.

long-form RAG
information coverage
nugget coverage
retrieval
LLM reranking

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM reranking
nugget coverage
long-form RAG
sub-question generation
information retrieval