AI Summary
Existing sequential recommendation methods struggle to simultaneously model the dynamic temporal dependencies and high-order structural relationships inherent in user interactions. This work proposes a unified framework that integrates Transformers and graph neural networks, leveraging a cross-representation alignment mechanism to enable knowledge transfer between sequential and graph-structured data. By jointly capturing temporal dynamics and structural dependencies, the approach overcomes the limitations of single-modality modeling paradigms. Extensive experiments on multiple public benchmarks demonstrate that the proposed method significantly outperforms purely sequential models, purely graph-based models, and existing hybrid approaches, achieving state-of-the-art performance in next-item prediction accuracy.
Abstract
Transformer architectures, which capture sequential dependencies in a user's interaction history, have become the dominant approach in sequential recommender systems. Despite their success, such models treat sequence elements in isolation and account for the complex relationships between them only implicitly. Graph neural networks, in contrast, model these relationships explicitly through higher-order interactions but often fail to capture how they evolve over time, which limits their use for predicting the next interaction. To fill this gap, we present a new framework that combines transformers and graph neural networks and aligns the different representations to solve the next-item prediction task. Our solution simultaneously encodes structural dependencies in the interaction graph and tracks how they change over time. Experimental results on several open datasets demonstrate that the proposed framework consistently outperforms both purely sequential and purely graph-based approaches in recommendation quality, as well as recent methods that combine both types of signals.
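The abstract does not say how the sequential and graph representations are aligned. As an illustration only, one common way to align two views of the same user is an InfoNCE-style contrastive loss, sketched below. The function name `alignment_loss`, the `temperature` parameter, and the choice of InfoNCE itself are assumptions made for this sketch and are not taken from the paper.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v] if norm > 0 else list(v)

def alignment_loss(seq_emb, graph_emb, temperature=0.1):
    """Hypothetical InfoNCE-style alignment loss (an assumption, not the paper's method).

    seq_emb[i] and graph_emb[i] are two embeddings of the same user, e.g. one
    from a sequence encoder and one from a graph encoder. The loss is low when
    each sequence embedding is most similar to its own graph embedding.
    """
    seq = [normalize(v) for v in seq_emb]
    gra = [normalize(v) for v in graph_emb]
    total = 0.0
    for i, s in enumerate(seq):
        # Cosine similarity to every graph embedding, scaled by the temperature.
        logits = [dot(s, g) / temperature for g in gra]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of the matching (positive) pair.
        total += log_denom - logits[i]
    return total / len(seq)

# Toy data: two users with 2-dimensional embeddings.
seq_view = [[1.0, 0.0], [0.0, 1.0]]
graph_aligned = [[1.0, 0.0], [0.0, 1.0]]   # each user matches its own row
graph_swapped = [[0.0, 1.0], [1.0, 0.0]]   # rows deliberately mismatched

loss_aligned = alignment_loss(seq_view, graph_aligned)
loss_swapped = alignment_loss(seq_view, graph_swapped)
print(loss_aligned < loss_swapped)
```

In a setup like this, the alignment term would typically be added to the next-item prediction loss with a weighting coefficient, so that both encoders are trained jointly; whether the paper does this is not stated in the abstract.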