Learning Variable-Length Tokenization for Generative Recommendation

📅 2026-05-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing generative recommendation methods assign every item a fixed-length semantic ID, which ignores the different encoding needs of popular and long-tail items and leads to a mismatch in representation capacity. To address this, we propose VarLenRec, which is, to our knowledge, the first framework to learn variable-length IDs. We identify and formalize the “Popularity-Length Paradox” and derive an information-theoretic rule for allocating ID length according to item popularity. The approach combines hyperbolic residual quantization with a soft length controller, guided by a popularity prior, to assign an ID length to each item. Experiments on multiple benchmark datasets show that VarLenRec improves both recommendation accuracy and training/inference efficiency over state-of-the-art methods.
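
As a rough illustration of how a soft length controller could make ID length differentiable, the sketch below gives each quantization layer a sigmoid gate and takes a layer's retention probability to be the product of all gates up to that layer, so a deeper layer can only be kept if the shallower ones are. The gate parameterization, the function names, and the squared-error penalty toward a prior length are assumptions made for illustration; the summary does not specify them.

```python
import math


def retention_probs(gate_logits):
    """Cumulative retention probability per layer (hypothetical parameterization).

    Layer l is retained with probability prod_{k<=l} sigmoid(gate_logits[k]),
    so the probabilities are non-increasing with depth.
    """
    probs = []
    keep = 1.0
    for logit in gate_logits:
        keep *= 1.0 / (1.0 + math.exp(-logit))  # sigmoid gate for this layer
        probs.append(keep)
    return probs


def expected_length(gate_logits):
    """Expected number of retained layers, a continuous proxy for ID length."""
    return sum(retention_probs(gate_logits))


def length_prior_penalty(gate_logits, prior_length):
    """Assumed regularizer: squared gap between expected and prior length."""
    return (expected_length(gate_logits) - prior_length) ** 2


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0, -3.0]  # made-up gate logits for one item
    print(retention_probs(logits))
    print(expected_length(logits))
    print(length_prior_penalty(logits, prior_length=2.0))
```

Because the expected length is a smooth function of the gate logits, a penalty of this kind can be minimized with gradient descent, whereas a hard "keep the first k layers" decision cannot.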

📝 Abstract

Generative recommendation reformulates recommendation as next-token prediction over discrete semantic identifiers (IDs). A fundamental yet unexplored design choice is that existing methods employ fixed-length tokenization for all items, implicitly assuming uniform encoding capacity regardless of item characteristics. Through systematic experiments across four datasets, we discover the Popularity-Length Paradox: popular items achieve optimal performance with short IDs, while tail items require substantially longer codes to capture discriminative semantics. This reveals a critical mismatch where popular items benefit from abundant collaborative signals and require minimal semantic detail, whereas tail items must rely on fine-grained content features due to sparse interaction data. To address this, we propose VarLenRec, a framework for learning variable-length tokenization. We develop Popularity-Weighted Information Budget Allocation (PIBA), an information-theoretic framework proving that optimal ID length should scale as a negative power of popularity. Directly implementing variable-length allocation faces two technical challenges: standard Euclidean residual quantization lacks geometric capacity to support diverse code lengths without distortion, and discrete length decisions are non-differentiable. We address these through Hyperbolic Residual Quantization, which leverages the exponential volume growth of the Poincaré ball to naturally stratify encoding capacity, and a Soft Length Controller, which enables differentiable length prediction via continuous layer retention probabilities regularized by PIBA-derived priors. Extensive experiments demonstrate that VarLenRec achieves significant improvements over state-of-the-art methods in recommendation accuracy and training/inference efficiency, revealing the importance of adaptive encoding capacity in generative recommendation.
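
To make the PIBA scaling claim concrete, the following minimal sketch maps an item's interaction count to an ID length using a negative power law. The abstract states only that the optimal length scales as a negative power of popularity; the exponent `alpha`, the constant `scale`, the length bounds, and the function name `allocate_id_length` are hypothetical values chosen for illustration.

```python
import math


def allocate_id_length(popularity, alpha=0.5, scale=8.0, min_len=1, max_len=8):
    """Return an ID length that shrinks as a negative power of popularity.

    Assumed form: length ~ scale * popularity ** (-alpha), rounded up and
    clipped to [min_len, max_len]. All constants are illustrative.
    """
    if popularity <= 0:
        # Items with no interactions get the maximum budget.
        return max_len
    raw = scale * popularity ** (-alpha)
    return max(min_len, min(max_len, math.ceil(raw)))


if __name__ == "__main__":
    for count in (1, 10, 100, 1_000, 1_000_000):
        print(count, allocate_id_length(count))
```

Under these assumed constants, an item seen once receives the longest ID and a very popular item receives the shortest, which matches the direction of the Popularity-Length Paradox described above.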

Problem

Research questions and friction points this paper is trying to address.

generative recommendation

variable-length tokenization

Popularity-Length Paradox

encoding capacity

discrete semantic identifiers

Innovation

Methods, ideas, or system contributions that make the work stand out.

Variable-Length Tokenization

Generative Recommendation

Hyperbolic Residual Quantization