🤖 AI Summary
Existing LLM-based recommender systems rely on intuitive (System-1) prompting, resulting in fragile reasoning paths and low fault tolerance. To address this, the authors propose $R^{4}$ec, a three-stage iterative framework of reasoning, reflection, and refinement that enables System-2-style deliberate reasoning. First, an actor LLM generates recommendation rationales; second, a dedicated reflection model evaluates their logical consistency and provides structured feedback; third, multi-round generative refinement iteratively improves rationale quality before the refined knowledge is integrated into the core recommendation network. $R^{4}$ec jointly embeds interpretable reflection and generative refinement into the recommendation pipeline, improving reasoning robustness and decision quality. Experiments on Amazon-Book and MovieLens-1M demonstrate substantial gains over strong baselines, and online A/B testing shows a 2.2% increase in advertising revenue, validating both effectiveness and industrial deployability.
📝 Abstract
Harnessing Large Language Models (LLMs) for recommendation systems has emerged as a prominent avenue, drawing substantial research interest. However, existing approaches primarily employ basic prompting techniques for knowledge acquisition, which resemble System-1 thinking. This makes them highly sensitive to errors in the reasoning path, where even a small mistake can lead to an incorrect inference. To this end, in this paper we propose $R^{4}$ec, a reasoning, reflection and refinement framework that evolves the recommendation system into a weak System-2 model. Specifically, we introduce two models: an actor model that engages in reasoning, and a reflection model that judges these responses and provides valuable feedback. The actor model then refines its response based on the feedback, ultimately leading to improved responses. We employ an iterative reflection and refinement process, enabling LLMs to facilitate slow and deliberate System-2-like thinking. Ultimately, the final refined knowledge is incorporated into a recommendation backbone for prediction. We conduct extensive experiments on the Amazon-Book and MovieLens-1M datasets to demonstrate the superiority of $R^{4}$ec. We also deploy $R^{4}$ec on a large-scale online advertising platform, observing a 2.2% increase in revenue. Furthermore, we investigate the scaling properties of the actor model and the reflection model.
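The actor-reflector loop described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `actor_model` and `reflection_model` are hypothetical stand-in stubs for the two LLMs, and the stopping rule (approve or exhaust a round budget) is an assumption about how such a loop would typically terminate.

```python
# Sketch of the reason -> reflect -> refine loop from the abstract.
# actor_model and reflection_model are placeholder stubs standing in
# for the two LLMs; in practice each would be a model call.

def actor_model(user_history, feedback=None):
    """Stub actor: generates (or refines) a recommendation rationale."""
    base = f"User liked {', '.join(user_history)}; recommend similar items."
    if feedback:
        return base + f" Refined per feedback: {feedback}"
    return base

def reflection_model(rationale):
    """Stub reflector: returns (is_consistent, feedback)."""
    if "Refined" in rationale:
        return True, ""
    return False, "cite concrete shared attributes between the items"

def r4ec_knowledge(user_history, max_rounds=3):
    """Iterate reasoning, reflection, and refinement; return the
    final rationale, which would feed the recommendation backbone."""
    rationale = actor_model(user_history)
    for _ in range(max_rounds):
        ok, feedback = reflection_model(rationale)
        if ok:
            break
        rationale = actor_model(user_history, feedback=feedback)
    return rationale

print(r4ec_knowledge(["Dune", "Foundation"]))
```

Here a single round of feedback suffices for the stub reflector to approve; with real LLMs, `max_rounds` bounds the cost of the slow, deliberate System-2-style iteration.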