HiSAC: Hierarchical Sparse Activation Compression for Ultra-long Sequence Modeling in Recommenders

📅 2026-02-24
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Modeling ultra-long user behavior sequences in recommender systems is hindered by latency and memory constraints, and existing interest-centric approaches struggle to capture fine-grained preferences while often neglecting long-tail interests. This work proposes a hierarchical sparse activation mechanism that constructs personalized interest proxies through multi-level semantic IDs and a global hierarchical codebook, combined with soft-routing attention to aggregate historical behavioral signals in semantic space. The approach effectively balances personalization, fine-grained interest representation, and long-tail interest preservation, significantly reducing quantization error and computational overhead. Deployed on Taobao's "Guess You Like" homepage, online A/B tests demonstrate a 1.65% increase in click-through rate alongside substantially reduced inference costs.

📝 Abstract
Modern recommender systems leverage ultra-long user behavior sequences to capture dynamic preferences, but end-to-end modeling is infeasible in production due to latency and memory constraints. While summarizing history via interest centers offers a practical alternative, existing methods struggle to (1) identify user-specific centers at appropriate granularity and (2) accurately assign behaviors, leading to quantization errors and loss of long-tail preferences. To alleviate these issues, we propose Hierarchical Sparse Activation Compression (HiSAC), an efficient framework for personalized sequence modeling. HiSAC encodes interactions into multi-level semantic IDs and constructs a global hierarchical codebook. A hierarchical voting mechanism sparsely activates personalized interest-agents as fine-grained preference centers. Guided by these agents, Soft-Routing Attention aggregates historical signals in semantic space, weighting by similarity to minimize quantization error and retain long-tail behaviors. Deployed on Taobao's "Guess What You Like" homepage, HiSAC achieves significant compression and cost reduction, with online A/B tests showing a consistent 1.65% CTR uplift, demonstrating its scalability and real-world effectiveness.
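The soft-routing idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, shapes, and temperature parameter are assumptions. Each historical behavior embedding is softly assigned to every interest-agent by similarity, so long-tail behaviors still contribute fractional weight to nearby agents instead of being discarded by a hard nearest-center assignment:

```python
# Hedged sketch of similarity-weighted soft routing, not the authors' code.
import numpy as np

def soft_routing_attention(behaviors, agents, temperature=1.0):
    """Compress a long behavior sequence into per-agent summaries.

    behaviors: (L, d) embeddings of historical interactions.
    agents:    (K, d) activated interest-agent embeddings, K << L.
    Returns:   (K, d) aggregated representations, one per agent.
    """
    # Similarity logits between every behavior and every agent: (L, K).
    logits = behaviors @ agents.T / temperature
    # Row-wise softmax: each behavior's weight is spread across agents,
    # so no behavior is hard-quantized to a single center.
    routing = np.exp(logits - logits.max(axis=1, keepdims=True))
    routing /= routing.sum(axis=1, keepdims=True)
    # Normalize per agent, then pool behaviors by their routing weights.
    weights = routing / (routing.sum(axis=0, keepdims=True) + 1e-9)
    return weights.T @ behaviors  # (K, d)

rng = np.random.default_rng(0)
summary = soft_routing_attention(rng.normal(size=(500, 16)),
                                 rng.normal(size=(8, 16)))
print(summary.shape)  # (8, 16): 500 behaviors compressed to 8 agents
```

The compression benefit is visible in the shapes: downstream attention now runs over K agent summaries rather than L raw behaviors, which is the source of the reduced inference cost the abstract reports.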
Problem

Research questions and friction points this paper is trying to address.

ultra-long sequence modeling
interest centers
quantization error
long-tail preferences
recommender systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Sparse Activation
Semantic Codebook
Interest-Agent
Soft-Routing Attention
Ultra-long Sequence Modeling