A Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation

📅 2024-03-20
🏛️ arXiv.org
📈 Citations: 5
Influential: 1
🤖 AI Summary
Existing short-video recommendation systems neglect comment content and user-comment interaction behaviors, preventing joint modeling of user preferences over videos and comments. Method: This paper proposes the first video-comment collaborative recommendation framework, featuring (1) a two-stage training paradigm comprising preference representation alignment distillation and recommendation-oriented fine-tuning, and (2) the novel use of a large language model (LLM) as a disposable semantic enhancer that augments the semantic understanding of the sequential recommendation (SR) model. The method combines the SR model and the LLM via cross-modal preference alignment to jointly optimize video and comment recommendation. Results: A/B testing on the Kuaishou platform demonstrates a 4.13% increase in comment watch time, and the method significantly outperforms state-of-the-art baselines on both recommendation tasks, validating that explicitly modeling comment interactions yields substantial gains for personalized short-video recommendation.

📝 Abstract
In online video platforms, reading or writing comments on interesting videos has become an essential part of the video watching experience. However, existing video recommender systems mainly model users' interaction behaviors with videos, lacking consideration of comments in user behavior modeling. In this paper, we propose a novel recommendation approach called LSVCR by leveraging user interaction histories with both videos and comments, so as to jointly conduct personalized video and comment recommendation. Specifically, our approach consists of two key components, namely sequential recommendation (SR) model and supplemental large language model (LLM) recommender. The SR model serves as the primary recommendation backbone (retained in deployment) of our approach, allowing for efficient user preference modeling. Meanwhile, we leverage the LLM recommender as a supplemental component (discarded in deployment) to better capture underlying user preferences from heterogeneous interaction behaviors. In order to integrate the merits of the SR model and the supplemental LLM recommender, we design a two-stage training paradigm. The first stage is personalized preference alignment, which aims to align the preference representations from both components, thereby enhancing the semantics of the SR model. The second stage is recommendation-oriented fine-tuning, in which the alignment-enhanced SR model is fine-tuned according to specific objectives. Extensive experiments in both video and comment recommendation tasks demonstrate the effectiveness of LSVCR. Additionally, online A/B testing on the KuaiShou platform verifies the actual benefits brought by our approach. In particular, we achieve a significant overall gain of 4.13% in comment watch time.
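The first training stage aligns the SR model's preference representations with those of the LLM recommender. The paper does not spell out the loss in this entry, but such alignment is commonly implemented as an in-batch contrastive (InfoNCE) objective over paired user representations. The sketch below illustrates that idea; the function name, temperature value, and use of NumPy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def info_nce_alignment_loss(sr_emb, llm_emb, temperature=0.07):
    """Illustrative contrastive alignment loss (assumption, not the paper's
    exact objective): for each user, the SR-model embedding and the LLM
    embedding of the same user form a positive pair (the diagonal), while
    all other in-batch pairs act as negatives."""
    # L2-normalize so dot products become cosine similarities
    sr = sr_emb / np.linalg.norm(sr_emb, axis=1, keepdims=True)
    llm = llm_emb / np.linalg.norm(llm_emb, axis=1, keepdims=True)
    logits = sr @ llm.T / temperature              # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # positives on the diagonal

# Toy check: identical representations should align better than mismatched ones
rng = np.random.default_rng(0)
sr = rng.normal(size=(8, 16))
loss_aligned = info_nce_alignment_loss(sr, sr)
loss_mismatched = info_nce_alignment_loss(sr, sr[::-1])
```

Because the LLM recommender is discarded in deployment, only the SR model's parameters are updated by this loss; the LLM acts purely as a training-time teacher whose semantics are distilled into the SR backbone.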
Problem

Research questions and friction points this paper is trying to address.

Improving video and comment recommendation using interaction histories
Integrating sequential recommendation models and large language models for preference modeling
Enhancing recommendation semantics via two-stage training alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes user interaction histories with videos and comments
Combines a sequential recommendation model with a large language model
Introduces two-stage training for preference alignment and fine-tuning