Short Video Segment-level User Dynamic Interests Modeling in Personalized Recommendation

📅 2025-04-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
In short-video recommendation, existing methods treat each video as a holistic unit and therefore miss how a user's interest evolves from segment to segment. To address this, the paper proposes modeling user interest at segment granularity with a model that combines a hybrid representation module, a multi-modal user-video encoder, and a segment interest decoder. The model is designed to capture dynamic interest patterns, cope with missing segment-level labels, and fuse different modalities. The authors also release a short-video dataset with segment-level video data and diverse user behaviors. Experiments on real-world short-video datasets show promising results on both video-skip prediction and short video recommendation, indicating that segment-level modeling deepens the understanding of user engagement and improves recommendations.

📝 Abstract
The rapid growth of short videos calls for effective recommender systems that match users with content tailored to their evolving preferences. Current video recommendation models primarily treat each video as a whole, overlooking how user preferences vary across specific video segments. In contrast, our research focuses on segment-level user interest modeling, which is crucial for understanding how users' preferences evolve during video browsing. To capture users' dynamic segment interests, we propose a model that integrates a hybrid representation module, a multi-modal user-video encoder, and a segment interest decoder. Our model addresses three challenges: capturing dynamic interest patterns, handling missing segment-level labels, and fusing different modalities, which together enable precise segment-level interest prediction. We present two downstream tasks to evaluate the effectiveness of our segment interest modeling approach: video-skip prediction and short video recommendation. Our experiments on real-world short video datasets with diverse modalities show promising results on both tasks. These results indicate that segment-level interest modeling provides a deeper understanding of user engagement and enhances video recommendations. We also release a dataset that includes segment-level video data and diverse user behaviors, enabling further research in segment-level interest modeling. This work offers a new perspective on users' segment-level preferences, with the potential for more personalized and engaging short video experiences.
Problem

Research questions and friction points this paper is trying to address.

Modeling dynamic user interests at short video segment level
Addressing missing segment-level labels in recommendation systems
Fusing multi-modal data for precise interest prediction
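
To make the missing-label problem concrete, the sketch below shows one way weak segment-level labels might be derived from a single watch-time signal. This is an illustrative assumption, not the labeling scheme used in the paper: the function name `weak_segment_labels`, the fixed `segment_length`, and the rule "fully watched = 1, segment where the user left = 0, later segments unobserved" are all hypothetical.

```python
import math
from typing import List, Optional


def weak_segment_labels(
    video_duration: float,
    segment_length: float,
    watch_time: float,
) -> List[Optional[int]]:
    """Derive weak per-segment labels from how long a user watched.

    Hypothetical rule (an assumption for illustration only):
      1    -> the segment was watched to its end
      0    -> the user left during this segment
      None -> the segment was never reached, so interest is unobserved
    """
    if video_duration <= 0 or segment_length <= 0:
        raise ValueError("video_duration and segment_length must be positive")

    # Clamp the watch time into the valid range [0, video_duration].
    watch_time = min(max(watch_time, 0.0), video_duration)
    n_segments = math.ceil(video_duration / segment_length)

    labels: List[Optional[int]] = []
    for i in range(n_segments):
        start = i * segment_length
        end = min(start + segment_length, video_duration)
        if watch_time >= end:
            labels.append(1)      # watched through the whole segment
        elif watch_time >= start:
            labels.append(0)      # left somewhere inside this segment
        else:
            labels.append(None)   # never reached
    return labels


# A 10-second video cut into 3-second segments, abandoned after 7 seconds.
labels = weak_segment_labels(10.0, 3.0, 7.0)
```

Under this rule, segments the user never reached stay unlabeled instead of being counted as negatives, which is one reason segment-level supervision is sparse.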
Innovation

Methods, ideas, or system contributions that make the work stand out.

Segment-level user interest modeling
Hybrid representation module integration
Multi-modal user-video encoder
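
The following minimal sketch illustrates how per-segment interest scores could feed the video-skip prediction task. It does not reproduce the paper's hybrid representation module, encoder, or decoder. The averaging fusion, the dot-product scoring, the threshold, and all names (`fuse_modalities`, `segment_interest`, `predict_skip_segment`) are assumptions chosen for brevity.

```python
import math
from typing import Dict, List, Optional, Sequence


def fuse_modalities(features: Dict[str, Sequence[float]]) -> List[float]:
    """Fuse modality vectors of equal length by simple averaging (an assumption)."""
    vectors = list(features.values())
    if not vectors:
        raise ValueError("at least one modality is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all modality vectors must share one dimension")
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def segment_interest(
    user: Sequence[float],
    segments: Sequence[Dict[str, Sequence[float]]],
) -> List[float]:
    """Score each segment as sigmoid(user . fused_segment), a value in (0, 1)."""
    scores = []
    for seg in segments:
        fused = fuse_modalities(seg)
        logit = sum(u * f for u, f in zip(user, fused))
        scores.append(1.0 / (1.0 + math.exp(-logit)))
    return scores


def predict_skip_segment(
    scores: Sequence[float], threshold: float = 0.5
) -> Optional[int]:
    """Return the index of the first segment whose interest falls below the
    threshold, or None if the user is predicted to watch to the end."""
    for i, score in enumerate(scores):
        if score < threshold:
            return i
    return None


# Toy example: two segments, each with hypothetical visual and audio features.
user_vec = [1.0, 0.0]
video_segments = [
    {"visual": [2.0, 0.0], "audio": [2.0, 0.0]},
    {"visual": [-2.0, 0.0], "audio": [-2.0, 1.0]},
]
scores = segment_interest(user_vec, video_segments)
skip_at = predict_skip_segment(scores)
```

In this toy setup the first segment scores above the threshold and the second below it, so the predicted skip point is the second segment. A learned model would replace the fixed averaging and dot product with trained components.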