🤖 AI Summary
This work addresses a key limitation of existing generative retrieval methods: they prioritize relevance while neglecting document authority, which can propagate unreliable information in high-stakes domains such as healthcare and finance. To bridge this gap, we propose AuthGR, an authority-aware generative retrieval framework that is the first to integrate authority modeling into generative retrieval. AuthGR uses vision-language models to fuse multimodal signals for authority scoring, and it combines a three-stage progressive training strategy with a hybrid ensemble inference mechanism. In offline evaluations, a 3B-parameter AuthGR model matches the performance of a 14B baseline, while large-scale online A/B tests and human evaluations confirm significant improvements in user engagement and result reliability.
📝 Abstract
Generative information retrieval (GenIR) formulates retrieval as a text-to-text generation task, leveraging the vast knowledge of large language models. However, existing work primarily optimizes for relevance and often overlooks document trustworthiness. This omission is critical in high-stakes domains such as healthcare and finance, where relying solely on semantic relevance risks retrieving unreliable information. To address this, we propose an Authority-aware Generative Retriever (AuthGR), the first framework that incorporates authority into GenIR. AuthGR consists of three key components: (i) Multimodal Authority Scoring, which employs a vision-language model to quantify authority from textual and visual cues; (ii) a Three-stage Training Pipeline, which progressively instills authority awareness into the retriever; and (iii) a Hybrid Ensemble Pipeline for robust deployment. Offline evaluations demonstrate that AuthGR improves both authority and accuracy, with our 3B model matching a 14B baseline. Crucially, large-scale online A/B tests and human evaluations conducted on a commercial web search platform confirm significant improvements in real-world user engagement and reliability.