🤖 AI Summary
This work addresses a critical limitation of existing generative recommender systems: they treat all tokens equally in semantic ID modeling despite their varying information content, which induces recommendation bias and weakens semantic representation. To mitigate this, the paper introduces an information-gain-driven token-weighting mechanism with two novel strategies, Front-Greater and Frequency, and integrates them into a curriculum learning framework that dynamically adjusts loss weights across training stages. This approach alleviates popularity bias and substantially improves the modeling of long-tail items. Extensive experiments demonstrate consistent gains over state-of-the-art methods across multiple benchmark datasets, for both head and tail items, and the model exhibits strong generalization and robustness under diverse semantic ID constructions.
📝 Abstract
Generative recommender systems have recently attracted attention by formulating next-item prediction as an autoregressive sequence generation task. However, most existing methods optimize standard next-token likelihood and implicitly treat all tokens as equally informative, which is misaligned with semantic-ID-based generation. Accordingly, we propose two complementary information-gain-based token-weighting strategies tailored to generative recommendation with semantic IDs. Front-Greater Weighting captures conditional semantic information gain by prioritizing early tokens that most effectively reduce candidate-item uncertainty given their prefixes and encode coarse semantics. Frequency Weighting models marginal information gain under long-tailed item and token distributions, upweighting rare tokens to counteract popularity bias. Beyond individual strategies, we introduce a multi-target learning framework with curriculum learning that jointly optimizes the two token-weighted objectives alongside standard likelihood, enabling stable optimization and adaptive emphasis across training stages. Extensive experiments on benchmark datasets show that our method consistently outperforms strong baselines and existing token-weighting approaches, with improved robustness, strong generalization across different semantic-ID constructions, and substantial gains on both head and tail items. Code is available at https://github.com/CHIUWEINING/Token-Weighted-Multi-Target-Learning-for-Generative-Recommenders-with-Curriculum-Learning.
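To make the two weighting ideas concrete, here is a minimal, hypothetical sketch of how position-decayed ("front-greater") and inverse-frequency token weights could modulate a per-token negative log-likelihood, with a simple mixture standing in for the multi-target objective. The function names, the decay/exponent parameters, and the mixing coefficients are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def front_greater_weights(seq_len, decay=0.8):
    """Illustrative position weights: earlier semantic-ID tokens get larger
    weight, since coarse prefix tokens prune more candidate items."""
    return decay ** np.arange(seq_len)

def frequency_weights(token_ids, token_counts, alpha=0.5):
    """Illustrative inverse-frequency weights: rare tokens are upweighted
    to counteract popularity bias; normalized to mean 1."""
    counts = np.array([token_counts[t] for t in token_ids], dtype=float)
    w = counts ** (-alpha)
    return w / w.mean()

def weighted_nll(token_log_probs, weights):
    """Token-weighted negative log-likelihood for one semantic-ID sequence."""
    return -np.sum(weights * token_log_probs)

# Toy example: a 4-token semantic ID with per-token log-probabilities
# and made-up corpus counts for each token.
log_probs = np.log(np.array([0.5, 0.4, 0.3, 0.2]))
counts = {7: 1000, 3: 50, 9: 5, 1: 500}
tokens = [7, 3, 9, 1]

fg = front_greater_weights(4)
fq = frequency_weights(tokens, counts)
loss_std = weighted_nll(log_probs, np.ones(4))  # standard likelihood
loss_fg = weighted_nll(log_probs, fg)
loss_fq = weighted_nll(log_probs, fq)

# A multi-target objective could mix the three losses; under curriculum
# learning the coefficients would shift across training stages.
lam_fg, lam_fq = 0.3, 0.3
loss = (1 - lam_fg - lam_fq) * loss_std + lam_fg * loss_fg + lam_fq * loss_fq
```

In this sketch the curriculum would amount to a schedule over `lam_fg` and `lam_fq`, e.g. emphasizing standard likelihood early and the weighted objectives later.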