M2Rec: Multi-scale Mamba for Efficient Sequential Recommendation

📅 2025-05-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing sequential recommendation methods suffer from two key limitations: (1) low efficiency, due to the quadratic computational complexity of Transformer-based models, and (2) insufficient multi-scale modeling: Mamba struggles to capture behavioral periodicity, suffers from semantic sparsity in user–item interactions, and exhibits weak multimodal feature fusion. To address these challenges, we propose M2Rec, the first Mamba-based architecture to integrate the Fast Fourier Transform (FFT) for explicit frequency-domain modeling of user behavioral periodicity. We further incorporate large language model (LLM)-derived text embeddings to mitigate interaction sparsity, and design a learnable gating mechanism for adaptive fusion of temporal, spectral, and semantic features. Extensive experiments demonstrate that M2Rec achieves a 3.2% absolute improvement in Hit Rate@10 over state-of-the-art Mamba baselines across multiple benchmark datasets, while attaining 1.2× faster inference than Transformers, establishing a new state of the art in sequential recommendation.

📝 Abstract
Sequential recommendation systems aim to predict users' next preferences based on their interaction histories, but existing approaches face critical limitations in efficiency and multi-scale pattern recognition. While Transformer-based methods struggle with quadratic computational complexity, recent Mamba-based models improve efficiency but fail to capture periodic user behaviors, leverage rich semantic information, or effectively fuse multimodal features. To address these challenges, we propose M2Rec, a novel sequential recommendation framework that integrates multi-scale Mamba with Fourier analysis, Large Language Models (LLMs), and adaptive gating. First, we enhance Mamba with the Fast Fourier Transform (FFT) to explicitly model periodic patterns in the frequency domain, separating meaningful trends from noise. Second, we incorporate LLM-based text embeddings to enrich sparse interaction data with semantic context from item descriptions. Finally, we introduce a learnable gating mechanism to dynamically balance temporal (Mamba), frequency (FFT), and semantic (LLM) features, ensuring harmonious multimodal fusion. Extensive experiments demonstrate that M2Rec achieves state-of-the-art performance, improving Hit Rate@10 by 3.2% over existing Mamba-based models while maintaining 20% faster inference than Transformer baselines. Our results highlight the effectiveness of combining frequency analysis, semantic understanding, and adaptive fusion for sequential recommendation. Code and datasets are available at: https://anonymous.4open.science/r/M2Rec.
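The abstract's first step, FFT-based separation of periodic trends from noise, can be sketched as a low-pass filter over the time axis of a sequence of interaction embeddings. This is only an illustrative sketch: the function name, the `keep_ratio` cutoff, and the hard zeroing of high frequencies are our assumptions, not the paper's actual implementation.

```python
import numpy as np

def fft_denoise(seq_emb: np.ndarray, keep_ratio: float = 0.25) -> np.ndarray:
    """Keep only the lowest-frequency components along the time axis.

    seq_emb: (T, d) array — T interactions, d-dimensional embeddings.
    keep_ratio: fraction of frequency bins to retain (assumed hyperparameter).
    """
    T = seq_emb.shape[0]
    spec = np.fft.rfft(seq_emb, axis=0)       # (T//2 + 1, d) frequency spectrum
    cutoff = max(1, int(spec.shape[0] * keep_ratio))
    spec[cutoff:] = 0.0                       # zero out high-frequency "noise"
    return np.fft.irfft(spec, n=T, axis=0)    # back to the time domain

rng = np.random.default_rng(0)
seq = rng.normal(size=(50, 8))                # 50 interactions, 8-dim embeddings
smoothed = fft_denoise(seq)
print(smoothed.shape)                         # (50, 8)
```

In a learned model the cutoff (or per-frequency weights) would typically be trainable rather than a fixed ratio; the sketch only shows the mechanics of moving to the frequency domain and back.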
Problem

Research questions and friction points this paper is trying to address.

Efficient sequential recommendation with multi-scale pattern recognition
Capturing periodic user behaviors via Fourier-enhanced Mamba
Dynamic fusion of temporal, frequency, and semantic features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-scale Mamba with Fourier analysis
LLM-based text embeddings enrichment
Learnable gate for multimodal fusion
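The learnable gate listed above can be sketched as a softmax-weighted sum of the three feature streams. This is a minimal sketch under our own assumptions: in the real model the gate logits would be produced by a trainable network conditioned on the input, whereas here they are a fixed vector for illustration.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def gated_fusion(h_time, h_freq, h_sem, gate_logits):
    """Adaptively weight temporal (Mamba), frequency (FFT), and
    semantic (LLM) features; weights are non-negative and sum to 1."""
    w = softmax(gate_logits)
    return w[0] * h_time + w[1] * h_freq + w[2] * h_sem

h_time = np.full(4, 1.0)   # stand-in temporal features
h_freq = np.full(4, 2.0)   # stand-in frequency features
h_sem  = np.full(4, 3.0)   # stand-in semantic features

# Equal logits -> equal weights -> elementwise mean of the three streams.
fused = gated_fusion(h_time, h_freq, h_sem, np.array([0.0, 0.0, 0.0]))
print(fused)  # [2. 2. 2. 2.]
```

Because the weights form a convex combination, the fused representation stays in the span of the three streams, which is one plausible reading of the "harmonious multimodal fusion" claim in the abstract.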
Qianru Zhang
School of Computing and Data Science, The University of Hong Kong

Liang Qu
School of Business and Law, Edith Cowan University

Honggang Wen
School of Computing and Data Science, The University of Hong Kong

Dong Huang
School of Computing and Data Science, The University of Hong Kong

S. Yiu
School of Computing and Data Science, The University of Hong Kong

Nguyen Quoc Viet Hung
Associate Professor, Griffith University
Big Data, IoT, Recommender System, Data Privacy, Graph Data Analysis

Hongzhi Yin
Professor and ARC Future Fellow, University of Queensland
Recommender System, Graph Learning, Spatial-temporal Prediction, Edge Intelligence, LLM