🤖 AI Summary
This work addresses the high computational cost of generative recommender systems when processing long user behavior sequences, which hinders real-time deployment. To mitigate this challenge, the authors propose a lightweight sequence compression method that leverages inherent item category features to substantially shorten sequences while preserving essential user interest information. The compressed sequences are then fed into generative recommendation models for efficient sequential modeling. Experiments on two large-scale datasets show that, relative to the HSTU model, the approach reduces computational cost by up to 6x and improves recommendation accuracy by up to 39% at comparable cost (i.e., similar sequence length), achieving a favorable balance between efficiency and performance.
📝 Abstract
Although generative recommenders demonstrate improved performance with longer sequences, their real-time deployment is hindered by substantial computational costs. To address this challenge, we propose a simple yet effective method for compressing long-term user histories by leveraging inherent item categorical features, thereby preserving user interests while enhancing efficiency. Experiments on two large-scale datasets demonstrate that, compared to the influential HSTU model, our approach achieves up to a 6x reduction in computational cost and up to 39% higher accuracy at comparable cost (i.e., similar sequence length).
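The abstract does not spell out the compression algorithm, but one plausible reading of "compressing long-term user histories by leveraging inherent item categorical features" is to collapse runs of consecutive interactions that share a category into a single entry. The sketch below is a hypothetical illustration of that idea (the function name and the keep-most-recent policy are assumptions, not the paper's specification):

```python
def compress_by_category(items, categories):
    """Collapse consecutive same-category interactions into one entry.

    items:      list of item ids (chronological order)
    categories: parallel list of category ids, one per item

    Hypothetical sketch: within each run of a repeated category, only the
    most recent item is kept, shortening the sequence while retaining a
    coarse trace of the user's category-level interests.
    """
    compressed = []  # list of (item, category) pairs
    for item, cat in zip(items, categories):
        if compressed and compressed[-1][1] == cat:
            compressed[-1] = (item, cat)  # same category run: keep latest item
        else:
            compressed.append((item, cat))
    return [item for item, _ in compressed]


# Example: five interactions spanning two categories compress to three.
history = [101, 102, 201, 202, 103]
cats    = [0,   0,   1,   1,   0]
print(compress_by_category(history, cats))  # -> [102, 202, 103]
```

Any scheme of this shape shortens the attention context roughly in proportion to how repetitive category visits are, which is where the claimed efficiency gain would come from.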