Generate the browsing process for short-video recommendation

📅 2025-04-02
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses segment-level watch time prediction in short-video recommendation. We propose SCAM (Segment Content Aware Model via User Engagement Feedback), which eschews explicit multimodal video feature extraction. Instead, SCAM implicitly models segment-level semantics and the evolution of user interest using fine-grained engagement signals, such as dwell time and skip behavior, from historical interactions. Our key contributions include: (1) a behavior-driven content representation paradigm; (2) a Transformer-like architecture with segment-wise temporal partitioning; and (3) duration-bias-aware sequential modeling, enabling accurate segment-level watch time prediction without explicit semantic understanding. Extensive experiments on both industrial and public benchmarks demonstrate that SCAM achieves state-of-the-art performance in watch time prediction, while significantly improving recommendation relevance and average user session duration.

📝 Abstract
This paper introduces a new model to generate the browsing process for short-video recommendation and proposes a novel Segment Content Aware Model via User Engagement Feedback (SCAM) for watch time prediction in video recommendation. Unlike existing methods that rely on multimodal features for video content understanding, SCAM implicitly models video content through users' historical watching behavior, enabling segment-level understanding without complex multimodal data. By dividing videos into segments based on duration and employing a Transformer-like architecture, SCAM captures the sequential dependence between segments while mitigating duration bias. Extensive experiments on industrial-scale and public datasets demonstrate SCAM's state-of-the-art performance in watch time prediction. The proposed approach offers a scalable and effective solution for video recommendation by leveraging segment-level modeling and users' engagement feedback.
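The duration-based segmentation described above can be illustrated with a minimal sketch. It assumes a fixed segment length in seconds and a binary "watched through this segment" label derived from the observed watch time; the abstract does not specify either choice, and the names `split_into_segments` and `segment_labels` are hypothetical, not taken from the paper.

```python
from typing import List, Tuple


def split_into_segments(duration: float, segment_len: float) -> List[Tuple[float, float]]:
    """Partition [0, duration) into consecutive (start, end) segments.

    Assumption: segments have a fixed length; the last one may be shorter.
    """
    if duration <= 0 or segment_len <= 0:
        raise ValueError("duration and segment_len must be positive")
    segments = []
    start = 0.0
    while start < duration:
        end = min(start + segment_len, duration)
        segments.append((start, end))
        start = end
    return segments


def segment_labels(watch_time: float, segments: List[Tuple[float, float]]) -> List[int]:
    """Label each segment 1 if the user watched through its end, else 0.

    Assumption: this binary labeling is illustrative only.
    """
    return [1 if watch_time >= end else 0 for _, end in segments]


segments = split_into_segments(duration=12.0, segment_len=5.0)
labels = segment_labels(watch_time=7.0, segments=segments)
```

Here a 12-second video yields three segments, and a 7-second view is labeled as completing only the first. Per-segment labels of this kind are one way engagement feedback could supervise a sequence model without any multimodal content features.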

Problem

Research questions and friction points this paper is trying to address.

Dynamically simulate users' short-video watching journey for watch time prediction
Model users' sustained interest using collaborative information and interaction behaviors
Capture sequential dependencies between video segments while mitigating duration bias

Innovation

Methods, ideas, or system contributions that make the work stand out.

A generative method simulates the user's video-watching journey
Transformer architecture captures sequential dependencies between segments
Segment-level modeling uses user engagement feedback for prediction
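
As a rough illustration of how a simulated watching journey could be turned into a watch time estimate, the sketch below assumes a model has already produced, for each segment, the probability that a user who reaches the segment finishes it. The expected watch time is then the sum of segment lengths weighted by the probability of surviving to the end of each segment. This aggregation rule and the function name `expected_watch_time` are assumptions for illustration; the summary does not state how SCAM combines segment-level outputs.

```python
from typing import Sequence


def expected_watch_time(continue_probs: Sequence[float],
                        segment_lengths: Sequence[float]) -> float:
    """Expected watch time from per-segment continuation probabilities.

    continue_probs[k] is the assumed probability of finishing segment k
    given that the user reached it. Partial views within a segment are
    ignored, so this counts only fully completed segments.
    """
    if len(continue_probs) != len(segment_lengths):
        raise ValueError("inputs must have the same length")
    expected = 0.0
    survival = 1.0  # probability of having completed all segments so far
    for p, length in zip(continue_probs, segment_lengths):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        survival *= p
        expected += survival * length
    return expected


estimate = expected_watch_time([1.0, 0.5, 0.0], [5.0, 5.0, 2.0])
```

Because each term depends on the cumulative product of earlier probabilities, the estimate reflects the sequential dependence between segments: a low continuation probability early in the video suppresses the contribution of every later segment.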