Position-Aware Sequential Attention for Accurate Next Item Recommendations

📅 2026-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of conventional sequential recommendation models that employ additive positional encoding, which entangles positional information with semantic content and suffers from signal decay in deep networks, thereby hindering the capture of complex temporal patterns. To overcome this, the authors propose a kernelized self-attention mechanism that leverages a learnable positional kernel to dynamically modulate attention weights within an independent positional space. This design effectively decouples position from semantics and enables multi-scale temporal modeling. Integrated into a Transformer architecture, the proposed method achieves significant performance gains over strong baselines on standard next-item recommendation benchmarks, demonstrating both its effectiveness and generalization capability.

📝 Abstract
Sequential self-attention models usually rely on additive positional embeddings, which inject positional information into item representations at the input. In the absence of positional signals, the attention block is permutation-equivariant over sequence positions and thus has no intrinsic notion of temporal order beyond causal masking. We argue that additive positional embeddings make the attention mechanism only superficially sensitive to sequence order: positional information is entangled with item embedding semantics, propagates weakly in deep architectures, and limits the ability to capture rich sequential patterns. To address these limitations, we introduce a kernelized self-attention mechanism, where a learnable positional kernel operates purely in the position space, disentangled from semantic similarity, and directly modulates attention weights. When applied per attention block, this kernel enables adaptive multi-scale sequential modeling. Experiments on standard next-item prediction benchmarks show that our positional kernel attention consistently improves over strong competing baselines.
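The abstract does not spell out the kernel's exact parameterization, so the following is only a minimal sketch of the idea under stated assumptions: a PyTorch-style causal attention block in which a learnable per-head kernel over relative positions modulates the attention logits separately from the content (query-key) scores. The class name `PositionalKernelAttention`, the `rel_bias` lookup over clamped distances, and the `max_len` cap are illustrative choices, not the authors' released implementation.

```python
# Sketch (assumption, not the paper's code): causal self-attention whose weights are
# modulated by a learnable kernel defined purely over relative positions, kept
# disentangled from the semantic similarity scores.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PositionalKernelAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int, max_len: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Hypothetical kernel form: one learnable value per relative distance and per
        # head, so each head can attend at its own temporal scale.
        self.rel_bias = nn.Parameter(torch.zeros(n_heads, max_len))

    def forward(self, x):  # x: (batch, seq_len, d_model)
        B, L, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(B, L, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(B, L, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(B, L, self.n_heads, self.d_head).transpose(1, 2)

        # Content (semantic) scores, computed without any positional signal.
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5  # (B, H, L, L)

        # Positional kernel: depends only on the distance i - j, so it lives purely
        # in position space and is added to the logits (equivalently, it multiplies
        # the post-softmax attention weights).
        idx = torch.arange(L, device=x.device)
        dist = (idx.unsqueeze(1) - idx.unsqueeze(0)).clamp(min=0, max=self.rel_bias.size(1) - 1)
        scores = scores + self.rel_bias[:, dist].unsqueeze(0)   # (B, H, L, L)

        # Causal mask for next-item prediction.
        causal = torch.triu(torch.ones(L, L, dtype=torch.bool, device=x.device), diagonal=1)
        scores = scores.masked_fill(causal, float("-inf"))

        attn = F.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, L, -1)
        return self.out(out)
```

Under this reading, adding the kernel to the logits before the softmax amounts to multiplying the attention weights by a positive position-dependent factor, and giving each head (and, per the abstract, each attention block) its own kernel is what would allow different heads to specialize to different temporal scales. The paper's actual kernel may be parameterized differently.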
Problem

Research questions and friction points this paper is trying to address.

sequential recommendation
self-attention
positional embedding
sequence modeling
next item prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

positional kernel
disentangled position modeling
kernelized self-attention
sequential recommendation
position-aware attention
Timur Nabiev
Skolkovo Institute of Science and Technology
Evgeny Frolov
AIRI
Recommender Systems · Tensor Factorization · Hyperbolic Geometry