HiSAC: Hierarchical Sparse Activation Compression for Ultra-long Sequence Modeling in Recommenders

📅 2026-02-24
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Modeling ultra-long user behavior sequences in recommender systems is hindered by latency and memory constraints, and existing interest-centric approaches struggle to capture fine-grained preferences while often neglecting long-tail interests. This work proposes a hierarchical sparse activation mechanism that constructs personalized interest proxies through multi-level semantic IDs and a global hierarchical codebook, combined with soft-routing attention to aggregate historical behavioral signals in semantic space. The approach effectively balances personalization, fine-grained interest representation, and long-tail interest preservation, significantly reducing quantization error and computational overhead. Deployed on Taobao's "Guess You Like" homepage, online A/B tests demonstrate a 1.65% increase in click-through rate alongside substantially reduced inference costs.

📝 Abstract
Modern recommender systems leverage ultra-long user behavior sequences to capture dynamic preferences, but end-to-end modeling is infeasible in production due to latency and memory constraints. While summarizing history via interest centers offers a practical alternative, existing methods struggle to (1) identify user-specific centers at appropriate granularity and (2) accurately assign behaviors, leading to quantization errors and loss of long-tail preferences. To alleviate these issues, we propose Hierarchical Sparse Activation Compression (HiSAC), an efficient framework for personalized sequence modeling. HiSAC encodes interactions into multi-level semantic IDs and constructs a global hierarchical codebook. A hierarchical voting mechanism sparsely activates personalized interest-agents as fine-grained preference centers. Guided by these agents, Soft-Routing Attention aggregates historical signals in semantic space, weighting by similarity to minimize quantization error and retain long-tail behaviors. Deployed on Taobao's "Guess What You Like" homepage, HiSAC achieves significant compression and cost reduction, with online A/B tests showing a consistent 1.65% CTR uplift, demonstrating its scalability and real-world effectiveness.
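The soft-routing idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, shapes, and temperature parameter are assumptions. Each historical behavior embedding is softly assigned to every interest-agent by similarity, so long-tail behaviors still contribute fractional weight to nearby agents instead of being discarded by a hard nearest-center assignment:

```python
# Hedged sketch of similarity-weighted soft routing, not the authors' code.
import numpy as np

def soft_routing_attention(behaviors, agents, temperature=1.0):
    """Compress a long behavior sequence into per-agent summaries.

    behaviors: (L, d) embeddings of historical interactions.
    agents:    (K, d) activated interest-agent embeddings, K << L.
    Returns:   (K, d) aggregated representations, one per agent.
    """
    # Similarity logits between every behavior and every agent: (L, K).
    logits = behaviors @ agents.T / temperature
    # Row-wise softmax: each behavior's weight is spread across agents,
    # so no behavior is hard-quantized to a single center.
    routing = np.exp(logits - logits.max(axis=1, keepdims=True))
    routing /= routing.sum(axis=1, keepdims=True)
    # Normalize per agent, then pool behaviors by their routing weights.
    weights = routing / (routing.sum(axis=0, keepdims=True) + 1e-9)
    return weights.T @ behaviors  # (K, d)

rng = np.random.default_rng(0)
summary = soft_routing_attention(rng.normal(size=(500, 16)),
                                 rng.normal(size=(8, 16)))
print(summary.shape)  # (8, 16): 500 behaviors compressed to 8 agents
```

The compression benefit is visible in the shapes: downstream attention now runs over K agent summaries rather than L raw behaviors, which is the source of the reduced inference cost the abstract reports.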
Problem

Research questions and friction points this paper is trying to address.

ultra-long sequence modeling
interest centers
quantization error
long-tail preferences
recommender systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Sparse Activation
Semantic Codebook
Interest-Agent
Soft-Routing Attention
Ultra-long Sequence Modeling