Rich-Media Re-Ranker: A User Satisfaction-Driven LLM Re-ranking Framework for Rich-Media Search

📅 2026-02-05
🤖 AI Summary
Existing re-ranking methods struggle to model users’ multidimensional intents and often neglect rich multimodal signals such as visual information, thereby limiting improvements in search satisfaction. To address this, this work proposes a user satisfaction–oriented re-ranking framework powered by large language models (LLMs). The approach employs a Query Planner to parse query evolution and disentangle user intents, integrates visual signals generated by a vision-language model (VLM), and leverages a multi-task reinforcement learning–enhanced LLM re-ranker to enable fine-grained, multidimensional relevance assessment. Extensive offline experiments demonstrate significant performance gains over state-of-the-art baselines, and the method has been successfully deployed in a large-scale industrial search system, yielding measurable improvements in online user engagement and satisfaction.

📝 Abstract
Re-ranking plays a crucial role in modern information search systems by refining the ranking of initial search results to better satisfy user information needs. However, existing methods show two notable limitations in improving user search satisfaction: inadequate modeling of multifaceted user intents and neglect of rich side information such as visual perception signals. To address these challenges, we propose the Rich-Media Re-Ranker framework, which aims to enhance user search satisfaction through multi-dimensional and fine-grained modeling. Our approach begins with a Query Planner that analyzes the sequence of query refinements within a session to capture genuine search intents, decomposing the query into clear and complementary sub-queries to enable broader coverage of users' potential intents. Subsequently, moving beyond primary text content, we integrate richer side information about candidate results, including visual-content signals generated by a VLM-based evaluator. These comprehensive signals are then processed alongside carefully designed re-ranking principles that consider multiple facets, including content relevance and quality, information gain, information novelty, and the visual presentation of cover images. The LLM-based re-ranker then performs a holistic evaluation based on these principles and integrated signals. To improve the scenario adaptability of the VLM-based evaluator and the LLM-based re-ranker, we further train both with multi-task reinforcement learning. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art baselines. Notably, the proposed framework has been deployed in a large-scale industrial search system, yielding substantial improvements in online user engagement rates and satisfaction metrics.
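The data flow described in the abstract (session queries, sub-queries, per-candidate signals, multi-facet ranking) can be illustrated with a toy sketch. This is not the authors' implementation: the real system uses an LLM-based Query Planner, a VLM-based evaluator, and an LLM-based re-ranker, whereas everything below is a hypothetical stand-in. `plan_queries` simply deduplicates session refinements, `visual_score` is an assumed precomputed number in [0, 1], term overlap stands in for relevance, unseen-term ratio stands in for novelty, and the weights are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    doc_id: str
    text: str
    # Hypothetical stand-in for the VLM-based evaluator's cover-image signal, in [0, 1].
    visual_score: float


def plan_queries(session_queries):
    """Toy stand-in for the Query Planner: distinct refinements, most recent first."""
    seen = []
    for query in reversed(session_queries):
        normalized = query.strip().lower()
        if normalized and normalized not in seen:
            seen.append(normalized)
    return seen


def term_overlap(query, text):
    """Fraction of query terms that appear in the text (crude relevance proxy)."""
    query_terms = set(query.split())
    if not query_terms:
        return 0.0
    return len(query_terms & set(text.lower().split())) / len(query_terms)


def facet_score(candidate, sub_queries, covered_terms, weights):
    """Weighted sum of relevance, novelty, and visual facets (weights are assumptions)."""
    w_rel, w_nov, w_vis = weights
    relevance = max((term_overlap(q, candidate.text) for q in sub_queries), default=0.0)
    terms = set(candidate.text.lower().split())
    novelty = len(terms - covered_terms) / len(terms) if terms else 0.0
    return w_rel * relevance + w_nov * novelty + w_vis * candidate.visual_score


def rerank(session_queries, candidates, weights=(0.6, 0.2, 0.2)):
    """Greedy re-ranking: pick the best candidate, then update what is already covered."""
    sub_queries = plan_queries(session_queries)
    remaining = list(candidates)
    covered_terms = set()
    ranked = []
    while remaining:
        best = max(remaining, key=lambda c: facet_score(c, sub_queries, covered_terms, weights))
        ranked.append(best)
        covered_terms |= set(best.text.lower().split())
        remaining.remove(best)
    return ranked


if __name__ == "__main__":
    session = ["hiking boots", "waterproof hiking boots"]
    pool = [
        Candidate("a", "waterproof hiking boots review", 0.2),
        Candidate("b", "waterproof hiking boots review", 0.9),
        Candidate("c", "cheap sandals sale", 0.9),
    ]
    for item in rerank(session, pool):
        print(item.doc_id)
```

The greedy loop is one simple way to let novelty depend on what has already been ranked; a listwise LLM re-ranker, as described in the abstract, would instead judge the whole candidate list in context.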
Problem

Research questions and friction points this paper is trying to address.

re-ranking
user satisfaction
rich-media search
user intent
visual perception
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rich-Media Re-Ranking
User Intent Modeling
Vision-Language Model (VLM)
LLM-based Re-ranker
Multi-task Reinforcement Learning
Zihao Guo
SCSE, Beihang University
Ligang Zhou
Baidu Inc
Zeyang Tang
Baidu Inc
Feicheng Li
Baidu Inc
Ying Nie
Baidu Inc
Zhiming Peng
Baidu Inc
Qingyun Sun
Assistant Professor, Beihang University
Data Mining, Graph Machine Learning, Deep Learning
Jianxin Li
School of Computer Science & Engineering, Beihang University
Big Data, AI, Intelligent Computing