Decomposed Reasoning with Reinforcement Learning for Relevance Assessment in UGC Platforms

📅 2025-08-04

📈 Citations: 0

✨ Influential: 0

career value

201K/year

🤖 AI Summary

In UGC platforms, RAG systems face challenges in accurately assessing query-document relevance due to sparse user feedback—leading to ambiguous user intent—and high noise in informal, unstructured text. Method: We propose a decomposition-based reasoning framework grounded in reinforcement learning. It decouples relevance judgment into two subtasks: implicit query intent inference and verbatim snippet extraction, jointly optimized via a tailored reward mechanism that leverages sparse feedback. Crucially, we introduce top-ranked documents as weak supervision signals to enhance discriminative robustness under noisy conditions. Contribution/Results: Evaluated on multiple offline benchmarks and real-world UGC platform online A/B tests, our method consistently outperforms state-of-the-art baselines, achieving an average 12.7% improvement in relevance assessment accuracy. The decomposition strategy and weakly supervised reward design significantly improve generalization and reliability in low-signal, high-noise retrieval scenarios.

Technology Category

Application Category

📝 Abstract

Retrieval-augmented generation (RAG) plays a critical role in user-generated content (UGC) platforms, but its effectiveness depends heavily on accurate relevance assessment of query-document pairs. Despite recent advances in applying large language models (LLMs) to relevance modeling, UGC platforms present unique challenges: 1) ambiguous user intent due to sparse user feedback in RAG scenarios, and 2) substantial noise introduced by informal and unstructured language. To address these issues, we propose the Reinforced Reasoning Model for Relevance Assessment (R3A), which introduces a decomposed reasoning framework over queries and candidate documents before scoring. R3A first leverages auxiliary high-ranked documents within the platform to infer latent query intent. It then performs verbatim fragment extraction to justify relevance decisions, thereby reducing errors caused by noisy UGC. Based on a reinforcement learning framework, R3A is optimized to mitigate distortions arising from ambiguous queries and unstructured content. Experimental results show that R3A significantly outperforms existing baseline methods in terms of relevance accuracy, across both offline benchmarks and online experiments.

Problem

Research questions and friction points this paper is trying to address.

Improving relevance assessment in UGC platforms

Addressing ambiguous user intent in RAG scenarios

Reducing noise from unstructured language in UGC

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposed reasoning framework for query-document pairs

Verbatim fragment extraction to reduce noise

Reinforcement learning optimizes ambiguous query handling

🔎 Similar Papers

No similar papers found.