🤖 AI Summary
To address severe noise interference and insufficient side-information integration in sparse-interaction and cold-start scenarios, this paper proposes a frequency-domain-aware two-stage multi-sequence fusion framework for Side-information Integrated Sequential Recommendation (SISR). It applies the Fast Fourier Transform (FFT) to denoise user behavior sequences in the frequency domain; designs an ID-attribute dual-stream encoder that fuses the two streams at both early and intermediate stages to capture fine-grained temporal dependencies; and builds multi-granularity sequence representations to better exploit side information. Extensive experiments on four benchmark datasets show improvements of up to 14.1% in Recall@20 and 12.5% in NDCG@20 over state-of-the-art SISR methods, validating the effectiveness of the proposed frequency-domain denoising and two-stage fusion strategy.
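The frequency-domain denoising step described above can be illustrated with a minimal sketch: transform a sequence of item embeddings with a real FFT along the time axis, drop the high-frequency band (short-term fluctuations), and transform back. The function name `frequency_filter` and the `keep_ratio` cutoff knob are assumptions for illustration, not the paper's actual parameterization.

```python
import numpy as np

def frequency_filter(seq_emb: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    """Low-pass filter a sequence of item embeddings along the time axis.

    seq_emb: (seq_len, dim) array of item embeddings.
    keep_ratio: fraction of low-frequency components to retain (hypothetical knob).
    """
    seq_len = seq_emb.shape[0]
    # Real FFT along the temporal axis: (seq_len, dim) -> (seq_len // 2 + 1, dim)
    spectrum = np.fft.rfft(seq_emb, axis=0)
    cutoff = max(1, int(spectrum.shape[0] * keep_ratio))
    # Zero out the high-frequency band, which carries short-term fluctuations
    spectrum[cutoff:] = 0
    # Inverse FFT back to the time domain, preserving the original length
    return np.fft.irfft(spectrum, n=seq_len, axis=0)
```

A stable long-term preference (a near-constant embedding trajectory) passes through unchanged, while a rapidly alternating component is attenuated toward its mean.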
📝 Abstract
Side-information Integrated Sequential Recommendation (SISR) benefits from auxiliary item information to infer hidden user preferences, which is particularly effective in sparse-interaction and cold-start scenarios. However, existing studies face two main challenges: (i) they fail to remove noisy signals in the item sequence, and (ii) they underutilize the potential of side-information integration. To tackle these issues, we propose a novel SISR model, Dual Side-Information Filtering and Fusion (DIFF), which employs frequency-based noise filtering and dual multi-sequence fusion. Specifically, we convert the item sequence to the frequency domain to filter out noisy short-term fluctuations in user interests. We then combine early and intermediate fusion to capture diverse relationships across item IDs and attributes. Thanks to our innovative filtering and fusion strategy, DIFF is more robust in learning subtle and complex item correlations in the sequence. DIFF outperforms state-of-the-art SISR models, achieving improvements of up to 14.1% and 12.5% in Recall@20 and NDCG@20 across four benchmark datasets.
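The combination of early and intermediate fusion mentioned above can be sketched as follows. Early fusion merges ID and attribute embeddings before encoding; intermediate fusion encodes each stream separately and merges afterward; the two views are then combined. The toy `encoder` (a row-wise normalization) and the additive combination are placeholders for illustration, not DIFF's actual attention architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, dim = 10, 16

# Hypothetical embedding lookups; in practice these would be learned tables.
id_emb = rng.normal(size=(seq_len, dim))    # item-ID embedding sequence
attr_emb = rng.normal(size=(seq_len, dim))  # attribute embedding sequence (e.g. category)

def encoder(x: np.ndarray) -> np.ndarray:
    """Stand-in for a sequence encoder block (here: row-wise L2 normalization)."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Early fusion: merge the two signals first, then encode the combined sequence.
early = encoder(id_emb + attr_emb)

# Intermediate fusion: encode each stream on its own path, then merge the outputs.
intermediate = encoder(id_emb) + encoder(attr_emb)

# Dual fusion: combine both views into one sequence representation.
fused = early + intermediate
```

The point of keeping both paths is that early fusion lets the encoder see ID-attribute interactions directly, while intermediate fusion preserves relationships that are specific to each stream.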