🤖 AI Summary
Current generative recommendation models (e.g., OneRec) lack explicit, controllable reasoning, which hinders explainable decision-making. To address this, we propose a unified generative recommendation framework that integrates dialogue understanding, structured reasoning, and personalized modeling. First, we design Itemic Alignment to achieve cross-modal semantic alignment between items and text. Second, we introduce Reasoning Scaffolding, a mechanism that explicitly activates the reasoning capacity of large language models (LLMs) in the recommendation context. Third, we construct a recommendation-specific reward function that accounts for the multi-validity of user preferences, coupled with a "Think-Ahead" deployment architecture and dialogue-context-aware personalized preference modeling. On public benchmarks, our method achieves state-of-the-art performance. Deployed in Kuaishou's industrial recommender system, it increases average user session duration by 0.159%, demonstrating that explicit textual reasoning enhances both recommendation interpretability and effectiveness.
📝 Abstract
The powerful generative capacity of Large Language Models (LLMs) has instigated a paradigm shift in recommendation. However, existing generative models (e.g., OneRec) operate as implicit predictors, critically lacking the capacity for explicit and controllable reasoning, a key advantage of LLMs. To bridge this gap, we propose OneRec-Think, a unified framework that seamlessly integrates dialogue, reasoning, and personalized recommendation. OneRec-Think incorporates: (1) Itemic Alignment: cross-modal Item-Textual Alignment for semantic grounding; (2) Reasoning Activation: Reasoning Scaffolding to activate LLM reasoning within the recommendation context; and (3) Reasoning Enhancement, where we design a recommendation-specific reward function that accounts for the multi-validity nature of user preferences. Experiments across public benchmarks show state-of-the-art performance. Moreover, our proposed "Think-Ahead" architecture enables effective industrial deployment on Kuaishou, achieving a 0.159% gain in APP Stay Time and validating the practical efficacy of the model's explicit reasoning capability.
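The abstract does not give the reward formula, but the "multi-validity" idea (many different items can be equally correct answers for one user) can be illustrated with a minimal sketch. Everything below is hypothetical: the function name, the reciprocal-rank shaping, and the cutoff `k` are illustrative choices, not the paper's actual reward.

```python
def multi_valid_reward(ranked_items, valid_items, k=10):
    """Illustrative multi-validity reward (not the paper's formula).

    Instead of rewarding only a single gold item, any item in the set of
    items the user genuinely prefers (`valid_items`) counts as correct.
    Credit decays with the rank of the first valid hit in the top-k list.
    """
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item in valid_items:
            return 1.0 / rank  # first hit among any valid item
    return 0.0  # no valid item surfaced in the top-k


# Example: the second-ranked item is one of two valid answers.
print(multi_valid_reward(["v_a", "v_b", "v_c"], {"v_b", "v_c"}))  # 0.5
```

A set-valued target like this avoids penalizing the model for producing a plausible item that merely differs from the single logged interaction, which is the failure mode a multi-validity reward is meant to address.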