Decomposed Reasoning with Reinforcement Learning for Relevance Assessment in UGC Platforms

📅 2025-08-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: In UGC platforms, RAG systems struggle to assess query-document relevance accurately for two reasons: sparse user feedback leaves user intent ambiguous, and informal, unstructured text is highly noisy. Method: We propose a decomposition-based reasoning framework grounded in reinforcement learning. It decouples relevance judgment into two subtasks, implicit query intent inference and verbatim snippet extraction, which are jointly optimized via a tailored reward mechanism that leverages sparse feedback. Crucially, we introduce top-ranked documents as weak supervision signals to make relevance discrimination more robust under noisy conditions. Contribution/Results: Evaluated on multiple offline benchmarks and in online A/B tests on a real-world UGC platform, our method consistently outperforms state-of-the-art baselines, achieving an average 12.7% improvement in relevance assessment accuracy. The decomposition strategy and weakly supervised reward design significantly improve generalization and reliability in low-signal, high-noise retrieval scenarios.

📝 Abstract
Retrieval-augmented generation (RAG) plays a critical role in user-generated content (UGC) platforms, but its effectiveness depends heavily on accurate relevance assessment of query-document pairs. Despite recent advances in applying large language models (LLMs) to relevance modeling, UGC platforms present unique challenges: 1) ambiguous user intent due to sparse user feedback in RAG scenarios, and 2) substantial noise introduced by informal and unstructured language. To address these issues, we propose the Reinforced Reasoning Model for Relevance Assessment (R3A), which introduces a decomposed reasoning framework over queries and candidate documents before scoring. R3A first leverages auxiliary high-ranked documents within the platform to infer latent query intent. It then performs verbatim fragment extraction to justify relevance decisions, thereby reducing errors caused by noisy UGC. Based on a reinforcement learning framework, R3A is optimized to mitigate distortions arising from ambiguous queries and unstructured content. Experimental results show that R3A significantly outperforms existing baseline methods in terms of relevance accuracy, across both offline benchmarks and online experiments.

Problem

Research questions and friction points this paper is trying to address.

Improving relevance assessment in UGC platforms
Addressing ambiguous user intent in RAG scenarios
Reducing noise from unstructured language in UGC

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposed reasoning framework for query-document pairs
Verbatim fragment extraction to reduce noise
Reinforcement learning optimizes ambiguous query handling
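
The verbatim-extraction idea can be sketched as a simple grounding check combined with a toy reward. This is a minimal illustration, not R3A's actual reward design: the `ModelOutput` fields, the `normalize` rule, and the `fragment_weight` value are all assumptions made for this sketch, and the source does not specify how the real reward is computed.

```python
from dataclasses import dataclass


@dataclass
class ModelOutput:
    """Hypothetical structured output of a decomposed-reasoning relevance model."""
    inferred_intent: str  # the model's guess at the latent query intent
    fragment: str         # text the model claims to quote from the document
    label: int            # predicted relevance label (0 = irrelevant, 1 = relevant)


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivial formatting differences are ignored."""
    return " ".join(text.split()).lower()


def is_verbatim(fragment: str, document: str) -> bool:
    """Return True if the non-empty fragment occurs in the document after normalization."""
    frag = normalize(fragment)
    return bool(frag) and frag in normalize(document)


def reward(output: ModelOutput, document: str, gold_label: int,
           fragment_weight: float = 0.5) -> float:
    """Illustrative reward: label correctness plus a bonus for a grounded quote.

    The additive form and the default weight are assumptions for this sketch,
    not values taken from the paper.
    """
    label_score = 1.0 if output.label == gold_label else 0.0
    fragment_score = 1.0 if is_verbatim(output.fragment, document) else 0.0
    return label_score + fragment_weight * fragment_score
```

A fragment that the model paraphrases or invents fails the `is_verbatim` check and earns no bonus, which is one plausible way to discourage justifications that are not actually present in a noisy UGC post.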

Authors

Xiaowei Yuan
Institute of Automation, Chinese Academy of Sciences
Lei Jin
Xiaohongshu Inc.
Haoxin Zhang
Xiaohongshu Inc.
Yan Gao
Xiaohongshu Inc.
Yi Wu
Xiaohongshu Inc.
Yao Hu
Zhejiang University
Machine Learning
Ziyang Huang
The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences
Jun Zhao
The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences
Kang Liu
The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences