🤖 AI Summary
Current generative recommendation (GR) research suffers from inconsistent modeling paradigms and non-uniform hyperparameter and experimental settings, hindering fair model comparison; moreover, the absence of open-source benchmarking frameworks impedes systematic evaluation and rapid iteration. To address these challenges, we propose and open-source GRID, a modular GR research framework built around semantic IDs (SIDs). SIDs map continuous semantic representations into discrete ID sequences, unifying large language model-based semantic understanding with collaborative filtering signals while retaining the benefits of end-to-end discrete decoding. Ablation studies with GRID reveal that several previously overlooked architectural components substantially affect performance. GRID provides reproducible benchmarks, a flexible component-swapping mechanism, and standardized evaluation protocols, improving development efficiency, cross-model comparability, and scientific rigor in GR research while fostering open collaboration and principled assessment.
📝 Abstract
Generative recommendation (GR) has gained increasing attention for its promising performance compared to traditional models. A key factor contributing to the success of GR is the semantic ID (SID), which converts continuous semantic representations (e.g., from large language models) into discrete ID sequences. This enables GR models with SIDs to both incorporate semantic information and learn collaborative filtering signals, while retaining the benefits of discrete decoding. However, varied modeling techniques, hyperparameters, and experimental setups in existing literature make direct comparisons between GR proposals challenging. Furthermore, the absence of an open-source, unified framework hinders systematic benchmarking and extension, slowing model iteration. To address these challenges, our work introduces and open-sources a framework for Generative Recommendation with semantic IDs, namely GRID, specifically designed for modularity to facilitate easy component swapping and accelerate idea iteration. Using GRID, we systematically experiment with and ablate different components of GR models with SIDs on public benchmarks. Our comprehensive experiments with GRID reveal that many overlooked architectural components in GR models with SIDs substantially impact performance. This offers both novel insights and validates the utility of an open-source platform for robust benchmarking and GR research advancement. GRID is open-sourced at https://github.com/snap-research/GRID.
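To make the SID idea concrete, here is a minimal, hypothetical sketch of residual quantization, one common way to convert a continuous item embedding into a discrete ID sequence. The codebooks below are random stand-ins purely for illustration; real SID pipelines (e.g., RQ-VAE-style tokenizers) learn them from data, and this snippet does not reflect GRID's actual API.

```python
import numpy as np

# Illustrative setup: 3 quantization levels, each with a 16-entry codebook
# over 8-dimensional embeddings. All values are random stand-ins.
rng = np.random.default_rng(0)
dim, codebook_size, levels = 8, 16, 3
codebooks = rng.normal(size=(levels, codebook_size, dim))

def embedding_to_sid(emb: np.ndarray) -> list[int]:
    """Greedily quantize `emb` into one discrete index per codebook level."""
    residual = emb.copy()
    sid = []
    for level in range(levels):
        # Pick the codeword closest to the current residual...
        dists = np.linalg.norm(codebooks[level] - residual, axis=1)
        idx = int(np.argmin(dists))
        sid.append(idx)
        # ...and pass on whatever the codeword failed to capture.
        residual -= codebooks[level][idx]
    return sid

item_embedding = rng.normal(size=dim)
sid = embedding_to_sid(item_embedding)
print(sid)  # a short discrete token sequence, e.g. three integers in [0, 16)
```

A generative recommender can then decode such short token sequences autoregressively, which is why SIDs retain the benefits of discrete decoding while still carrying semantic information from the original embeddings.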