SOLAR: SVD-Optimized Lifelong Attention for Recommendation

📅 2026-03-02

📈 Citations: 0

✨ Influential: 0

career value

216K/year

🤖 AI Summary

This work addresses the computational and memory bottlenecks of conventional attention mechanisms, which exhibit O(N²d) complexity and hinder scalability in large-scale recommendation systems when modeling long user behavior sequences. To overcome this limitation, the authors propose SVD-Attention, the first approach to integrate singular value decomposition (SVD) into the attention mechanism. By leveraging the inherent low-rank structure of user behavior sequences while preserving the softmax formulation, SVD-Attention reduces complexity to O(Ndr), achieving theoretically lossless compression. The method balances expressive power and efficiency, enabling effective modeling of sequences with tens of thousands of interactions and candidate sets in the thousands. Deployed in Kuaishou’s online recommendation system, it yielded a 0.68% increase in video views and significant improvements across multiple core business metrics.

Technology Category

Application Category

📝 Abstract

Attention mechanism remains the defining operator in Transformers since it provides expressive global credit assignment, yet its $O(N^2 d)$ time and memory cost in sequence length $N$ makes long-context modeling expensive and often forces truncation or other heuristics. Linear attention reduces complexity to $O(N d^2)$ by reordering computation through kernel feature maps, but this reformulation drops the softmax mechanism and shifts the attention score distribution. In recommender systems, low-rank structure in matrices is not a rare case, but rather the default inductive bias in its representation learning, particularly explicit in the user behavior sequence modeling. Leveraging this structure, we introduce SVD-Attention, which is theoretically lossless on low-rank matrices and preserves softmax while reducing attention complexity from $O(N^2 d)$ to $O(Ndr)$. With SVD-Attention, we propose SOLAR, SVD-Optimized Lifelong Attention for Recommendation, a sequence modeling framework that supports behavior sequences of ten-thousand scale and candidate sets of several thousand items in cascading process without any filtering. In Kuaishou's online recommendation scenario, SOLAR delivers a 0.68\% Video Views gain together with additional business metrics improvements.

Problem

Research questions and friction points this paper is trying to address.

attention mechanism

long-context modeling

recommender systems

computational complexity

low-rank structure

Innovation

Methods, ideas, or system contributions that make the work stand out.

SVD-Attention

low-rank approximation

long-sequence recommendation