How Well Does Generative Recommendation Generalize?

📅 2026-03-20
🤖 AI Summary
This work addresses the lack of a systematic way to evaluate how well generative recommender models generalize, a gap that obscures whether their performance stems from genuine generalization or from memorization of training data. We decouple memorization and generalization at the instance level and introduce a categorization-based evaluation framework. It reveals that the apparent “generalization” of generative models frequently relies on token-level memorization, whereas traditional item ID-based models perform better on instances that require memorization. Building on this insight, we propose a memorization-aware indicator and an adaptive fusion strategy that combines the complementary strengths of both model types on a per-instance basis. Experiments show that this approach improves overall recommendation performance.

📝 Abstract
A widely held hypothesis for why generative recommendation (GR) models outperform conventional item ID-based models is that they generalize better. However, there are few systematic ways to verify this hypothesis beyond a superficial comparison of overall performance. To address this gap, we categorize each data instance based on the specific capability required for a correct prediction: either memorization (reusing item transition patterns observed during training) or generalization (composing known patterns to predict unseen item transitions). Extensive experiments show that GR models perform better on instances that require generalization, whereas item ID-based models perform better when memorization is more important. To explain this divergence, we shift the analysis from the item level to the token level and show that what appears to be item-level generalization often reduces to token-level memorization for GR models. Finally, we show that the two paradigms are complementary. We propose a simple memorization-aware indicator that adaptively combines them on a per-instance basis, leading to improved overall recommendation performance.
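The instance-level split described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that an "item transition" is the pair (last item in the history, target item) and that an instance counts as memorization when that exact pair occurs consecutively in some training sequence. The paper's actual criterion may differ, and the names `collect_transitions` and `categorize` are hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple

Transition = Tuple[int, int]


def collect_transitions(train_sequences: Iterable[Sequence[int]]) -> Set[Transition]:
    """Gather every consecutive (previous_item, next_item) pair seen in training."""
    seen: Set[Transition] = set()
    for seq in train_sequences:
        seen.update(zip(seq, seq[1:]))
    return seen


def categorize(history: Sequence[int], target: int, seen: Set[Transition]) -> str:
    """Label one test instance under the assumed first-order transition rule."""
    if not history:
        # Assumption: with no history there is no transition to reuse.
        return "generalization"
    pair = (history[-1], target)
    return "memorization" if pair in seen else "generalization"


# Toy data: item IDs are arbitrary integers.
train = [[1, 2, 3], [2, 3, 4]]
seen = collect_transitions(train)
test_instances = [([1, 2], 3), ([1, 3], 2)]
labels = [categorize(h, t, seen) for h, t in test_instances]
print(labels)  # ['memorization', 'generalization']
```

With labels like these, an overall metric such as hit rate can be reported separately for each group, which is the kind of per-category comparison the abstract describes.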

Problem

Research questions and friction points this paper is trying to address.

generative recommendation
generalization
memorization
item ID-based models
recommendation systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

generative recommendation
generalization
memorization
token-level analysis
adaptive fusion
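
The adaptive fusion idea listed above, combining an item ID-based model and a GR model per instance, could look roughly like the following sketch. Everything specific here is an assumption rather than the paper's method: the indicator is taken to be `n / (n + k)`, where `n` is how often the last history item appeared as a transition source in training and `k` is a hypothetical smoothing constant, and both models are assumed to emit comparable, normalized scores per candidate item.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def source_counts(train_sequences: Iterable[Sequence[int]]) -> Counter:
    """Count how often each item is followed by another item in training."""
    counts: Counter = Counter()
    for seq in train_sequences:
        counts.update(seq[:-1])  # every item except the last has a successor
    return counts


def memorization_weight(history: Sequence[int], counts: Counter, k: float = 1.0) -> float:
    """Hypothetical indicator in [0, 1): higher when the last item is well observed."""
    if not history:
        return 0.0
    n = counts[history[-1]]
    return n / (n + k)


def fuse(id_scores: Dict[int, float], gr_scores: Dict[int, float], weight: float) -> Dict[int, float]:
    """Convex combination: lean on the ID-based model when memorization is likely."""
    items = set(id_scores) | set(gr_scores)
    return {
        item: weight * id_scores.get(item, 0.0) + (1.0 - weight) * gr_scores.get(item, 0.0)
        for item in items
    }


train = [[1, 2, 3], [2, 3, 4]]
counts = source_counts(train)
weight = memorization_weight([1, 2], counts)  # last item 2 was a source twice
fused = fuse({3: 0.9, 4: 0.1}, {3: 0.4, 4: 0.6}, weight)
top_item = max(fused, key=fused.get)
print(weight, top_item)
```

For a history ending in an item never seen as a transition source, the weight is 0 and the fused ranking falls back entirely to the GR model's scores, which matches the intuition that such instances call for generalization.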