ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation

📅 2025-02-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing generative recommendation (GR) models map each user action to the same fixed discrete tokens in every sequence, ignoring how an action's meaning varies with its context. To address this, we propose a context-aware action tokenization method with three components: first, each action is modeled as a set of item features rather than an atomic symbol; second, the vocabulary is built by merging feature patterns according to their co-occurrence frequency within individual sets and across adjacent sets; and third, set permutation regularization accounts for the unordered nature of feature sets, yielding multiple segmentations of the same action sequence that share the same semantics. Crucially, our approach is the first to embed contextual dependency directly into the action tokenization process. Experiments on public datasets show consistent improvements, with NDCG@10 gains of 6.00% to 12.82% over existing action tokenization methods.

📝 Abstract

Generative recommendation (GR) is an emerging paradigm where user actions are tokenized into discrete token patterns and autoregressively generated as predictions. However, existing GR models tokenize each action independently, assigning the same fixed tokens to identical actions across all sequences without considering contextual relationships. This lack of context-awareness can lead to suboptimal performance, as the same action may hold different meanings depending on its surrounding context. To address this issue, we propose ActionPiece to explicitly incorporate context when tokenizing action sequences. In ActionPiece, each action is represented as a set of item features, which serve as the initial tokens. Given the action sequence corpora, we construct the vocabulary by merging feature patterns as new tokens, based on their co-occurrence frequency both within individual sets and across adjacent sets. Considering the unordered nature of feature sets, we further introduce set permutation regularization, which produces multiple segmentations of action sequences with the same semantics. Experiments on public datasets demonstrate that ActionPiece consistently outperforms existing action tokenization methods, improving NDCG@10 by 6.00% to 12.82%.
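
The co-occurrence counting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `count_pairs` and `most_frequent_pair`, the feature strings, and the toy corpus are all hypothetical, and the sketch uses raw unweighted counts, which is an assumption made for simplicity. It only shows how token pairs might be counted both within one feature set and across adjacent sets before the most frequent pair is merged into a new token.

```python
from collections import Counter
from itertools import combinations


def count_pairs(corpus):
    """Count token pairs within each feature set and across adjacent sets.

    `corpus` is a list of action sequences; each action is a set of
    feature tokens. Raw counts are an assumption of this sketch.
    """
    counts = Counter()
    for sequence in corpus:
        for i, action in enumerate(sequence):
            # Unordered pairs of features inside the same action.
            for a, b in combinations(sorted(action), 2):
                counts[("within", a, b)] += 1
            # Pairs spanning this action and the next one in the sequence.
            if i + 1 < len(sequence):
                for a in sorted(action):
                    for b in sorted(sequence[i + 1]):
                        counts[("across", a, b)] += 1
    return counts


def most_frequent_pair(corpus):
    """Return the (pair, count) that would be merged into a new token next."""
    return count_pairs(corpus).most_common(1)[0]


# Hypothetical toy corpus: three users, each action is a set of item features.
corpus = [
    [frozenset({"brand:A", "cat:shoes"}), frozenset({"brand:A", "cat:socks"})],
    [frozenset({"brand:A", "cat:shoes"}), frozenset({"brand:B", "cat:socks"})],
    [frozenset({"brand:A", "cat:shoes"})],
]
print(most_frequent_pair(corpus))
# (('within', 'brand:A', 'cat:shoes'), 3)
```

In a full vocabulary construction loop, the winning pair would be replaced by a new token everywhere in the corpus and the counts recomputed, in the spirit of byte-pair-encoding style merging; that loop is omitted here.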

Problem

Research questions and friction points this paper is trying to address.

Context-aware tokenization of actions

Improving generative recommendation models

Enhancing sequence prediction accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Contextual action tokenization

Feature pattern merging

Set permutation regularization
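
As a rough illustration of why permuting the features inside each set leads to several segmentations with the same semantics, consider the sketch below. It is a simplified assumption rather than the paper's procedure: `permuted_flattening` and `segment` are hypothetical helpers, the merge rule is a toy single-pass greedy match over adjacent tokens, and the feature strings are invented.

```python
import random


def permuted_flattening(sequence, rng):
    """Flatten a sequence of feature sets, shuffling features inside each set."""
    flat = []
    for action in sequence:
        features = sorted(action)   # deterministic starting order
        rng.shuffle(features)       # random order within the set
        flat.extend(features)
    return flat


def segment(flat, merges):
    """Greedy left-to-right merge of adjacent tokens whose pair is in `merges`."""
    out = []
    i = 0
    while i < len(flat):
        if i + 1 < len(flat) and frozenset((flat[i], flat[i + 1])) in merges:
            # Name the merged token independently of the input order.
            out.append("+".join(sorted((flat[i], flat[i + 1]))))
            i += 2
        else:
            out.append(flat[i])
            i += 1
    return out


# One learned merge (hypothetical) and two orderings of the same feature set.
merges = {frozenset(("brand:A", "cat:shoes"))}
print(segment(["brand:A", "cat:shoes", "price:low"], merges))
# ['brand:A+cat:shoes', 'price:low']
print(segment(["brand:A", "price:low", "cat:shoes"], merges))
# ['brand:A', 'price:low', 'cat:shoes']

# A random permutation keeps the same features, only their order changes.
action_sequence = [frozenset({"brand:A", "cat:shoes", "price:low"})]
print(permuted_flattening(action_sequence, random.Random(0)))
```

Both orderings describe the same action, yet they tokenize differently, which is the kind of variation that set permutation regularization exposes to the model.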

🔎 Similar Papers

STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM