🤖 AI Summary
This work addresses the challenge of deploying Transformer-based sequential recommendation models over multi-behavior user data in large-scale e-commerce systems, where the quadratic cost of self-attention limits efficiency. We propose the Transition-Aware Graph Attention Network (TGA), which models transitions among diverse user behaviors through a structured sparse graph, achieving linear time complexity while maintaining high accuracy. TGA identifies critical behavior transitions from three perspectives (item, category, and neighborhood) and introduces a graph attention mechanism that jointly incorporates behavior types and user-item interactions. Extensive experiments show that TGA outperforms state-of-the-art methods across multiple metrics and substantially reduces computational overhead; it has also been deployed in a large industrial system, yielding significant improvements in key business metrics.
📝 Abstract
User interactions on e-commerce platforms are inherently diverse, involving behaviors such as clicking, favoriting, adding to cart, and purchasing. Transitions between these behaviors offer valuable insight into user-item interactions and serve as a key signal of evolving preferences. Consequently, there is growing interest in leveraging multi-behavior data to better capture user intent. Recent studies have explored sequential modeling of multi-behavior data, many relying on Transformer-based architectures whose self-attention scales quadratically with sequence length. While effective, these approaches incur high computational costs, limiting their applicability in large-scale industrial systems with long user sequences. To address this challenge, we propose the Transition-Aware Graph Attention Network (TGA), a linear-complexity approach for modeling multi-behavior transitions. Unlike standard Transformers, which attend to all behavior pairs equally, TGA constructs a structured sparse graph by identifying informative transitions from three perspectives: (a) item-level transitions, (b) category-level transitions, and (c) neighbor-level transitions. Built on this structured graph, TGA employs a transition-aware graph attention mechanism that jointly models user-item interactions and behavior transition types, capturing sequential patterns more accurately while maintaining computational efficiency. Experiments show that TGA outperforms state-of-the-art baselines while significantly reducing computational cost. Notably, TGA has been deployed in a large-scale industrial production environment, where it yields substantial improvements in key business metrics.
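To make the idea concrete, here is a minimal NumPy sketch of the two-step recipe the abstract describes: build a sparse edge set from item-, category-, and neighbor-level transitions, then compute attention only over those edges with a bias term indexed by the (source-behavior, target-behavior) pair. This is an illustration under assumed simplifications, not the paper's actual implementation; the edge rules (link each position only to the most recent prior occurrence of its item or category), the transition-bias table `W_trans`, and the function names are all assumptions.

```python
import numpy as np

def build_transition_edges(items, cats, window=1):
    """Sparse edges from three views (illustrative rules, assumed here):
    (a) item-level: link to the most recent prior position with the same item,
    (b) category-level: link to the most recent prior position with the same category,
    (c) neighbor-level: link adjacent positions within a small window."""
    edges = set()
    last_item, last_cat = {}, {}
    for j in range(len(items)):
        if items[j] in last_item:                 # (a) item-level transition
            edges.add((last_item[items[j]], j))
        if cats[j] in last_cat:                   # (b) category-level transition
            edges.add((last_cat[cats[j]], j))
        for i in range(max(0, j - window), j):    # (c) neighbor-level transition
            edges.add((i, j))
        last_item[items[j]] = j
        last_cat[cats[j]] = j
    return sorted(edges)

def sparse_transition_attention(H, behaviors, edges, W_trans):
    """Attention restricted to graph edges; each logit gets a bias from a
    (hypothetical) behavior-transition table W_trans[src_behavior, dst_behavior].
    H: (n, d) position representations; behaviors: length-n behavior ids."""
    n, d = H.shape
    logits = np.full((n, n), -np.inf)             # non-edges never attend
    for i, j in edges:
        bias = W_trans[behaviors[i], behaviors[j]]
        logits[j, i] = H[j] @ H[i] / np.sqrt(d) + bias
    np.fill_diagonal(logits, (H * H).sum(axis=1) / np.sqrt(d))  # self edges
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))   # row-wise softmax
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ H
```

Under these rules each position gains at most one item edge, one category edge, and `window` neighbor edges, so the edge count is O(n) and attention cost grows linearly with sequence length rather than quadratically, which is the efficiency property the abstract claims for TGA.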