AI Summary
This work addresses two limitations: existing sequential recommendation systems struggle to capture deep user semantics, while directly integrating large language models (LLMs) incurs prohibitive online inference costs. To overcome both, the authors propose an efficient knowledge distillation approach that uses a pretrained LLM offline to generate textual user profiles and injects their semantic knowledge into a lightweight sequential recommender via user-centric knowledge distillation. Notably, the method requires no online LLM invocation, no architectural modifications to the base recommender, and no fine-tuning of the LLM. While preserving the original inference efficiency, it substantially enhances the model's semantic understanding of user behavior and overall recommendation performance, thereby achieving, for the first time, an efficient, fine-tuning-free, and architecture-agnostic framework for semantically enhanced sequential recommendation.
Abstract
Sequential recommender systems have achieved significant success in modeling temporal user behavior but remain limited in capturing rich user semantics beyond interaction patterns. Large Language Models (LLMs) present opportunities to enhance user understanding with their reasoning capabilities, yet existing integration approaches incur prohibitive real-time inference costs. To address these limitations, we present a novel knowledge distillation method that distills textual user profiles generated by pre-trained LLMs into sequential recommenders without requiring LLM inference at serving time. The resulting approach maintains the inference efficiency of traditional sequential models while requiring neither architectural modifications nor LLM fine-tuning.
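The abstract does not specify the distillation objective, so the following is only a minimal sketch of one common way such a setup could look: a next-item loss combined with an alignment term that pulls the recommender's user representation toward an embedding of the LLM-written profile, computed once offline. The cosine alignment term, the `distill_weight` parameter, the `profile_cache` dictionary, and all function names are illustrative assumptions, not the authors' formulation.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def next_item_loss(scores, target_index):
    """Cross-entropy of the target item under a softmax over item scores."""
    max_score = max(scores)  # subtract the max for numerical stability
    log_sum_exp = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    return log_sum_exp - scores[target_index]

def training_loss(scores, target_index, user_repr, profile_embedding, distill_weight=0.5):
    """Assumed objective: recommendation loss plus weighted profile alignment."""
    rec = next_item_loss(scores, target_index)
    distill = cosine_distance(user_repr, profile_embedding)
    return rec + distill_weight * distill

def serve(scores):
    """Serving-time ranking uses item scores only; no profile or LLM is needed."""
    return max(range(len(scores)), key=lambda i: scores[i])

# Hypothetical offline cache: one embedding per LLM-generated user profile.
profile_cache = {"user_42": [0.6, 0.8, 0.0]}

scores = [2.0, 0.5, -1.0]  # toy item scores from the sequential recommender
loss_aligned = training_loss(scores, 0, [0.6, 0.8, 0.0], profile_cache["user_42"])
loss_misaligned = training_loss(scores, 0, [0.0, 0.0, 1.0], profile_cache["user_42"])
top_item = serve(scores)
```

In this sketch the profile embedding appears only in `training_loss`, which mirrors the property the abstract emphasizes: the teacher signal is consumed during training, while `serve` runs the unmodified recommender.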