Semantic IDs for Joint Generative Search and Recommendation

📅 2025-08-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies how to design Semantic IDs for a single generative model that serves both search and recommendation. The authors compare task-specific and cross-task strategies for constructing Semantic IDs, and test whether each task should have its own Semantic ID tokens or share them. In the setup the paper recommends, a bi-encoder is fine-tuned on both search and recommendation to produce item embeddings, which are then discretized into codes that form one unified Semantic ID space. According to the paper, this shared space offers an effective trade-off: it performs strongly on both tasks, whereas embeddings tuned for a single task may not generalize well to the joint setting.

📝 Abstract

Generative models powered by Large Language Models (LLMs) are emerging as a unified solution for both recommendation and search tasks. A key design choice in these models is how to represent items: traditionally through unique identifiers (IDs), and more recently through Semantic IDs composed of discrete codes obtained from embeddings. While task-specific embedding models can improve performance for individual tasks, they may not generalize well in a joint setting. In this paper, we explore how to construct Semantic IDs that perform well in both search and recommendation when using a unified model. We compare a range of strategies for constructing Semantic IDs, covering task-specific and cross-task approaches, and also examine whether each task should have its own Semantic ID tokens in a joint search and recommendation generative model. Our results show that using a bi-encoder model fine-tuned on both search and recommendation tasks to obtain item embeddings, followed by the construction of a unified Semantic ID space, provides an effective trade-off, enabling strong performance in both tasks. We hope these findings spark follow-up work on generalisable, semantically grounded ID schemes and inform the next wave of unified generative recommender architectures.
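
To make "discrete codes obtained from embeddings" concrete, here is a minimal sketch of one common way to do it, residual quantization. The abstract does not say which quantizer the authors use, so the technique, the function names (nearest, semantic_id), and the toy codebooks are all assumptions for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest(codebook: Sequence[Vector], x: Vector) -> int:
    """Return the index of the codebook entry closest to x (squared Euclidean)."""
    def sq_dist(c: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def semantic_id(embedding: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Residual quantization: each level encodes what the previous levels left over."""
    residual = list(embedding)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Subtract the chosen centroid so the next level refines the remainder.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


# Hypothetical toy codebooks (not from the paper): 2 levels, 2-dimensional embeddings.
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)],   # coarse level
    [(0.0, 0.0), (0.25, 0.0), (0.0, 0.25)],  # fine level
]

item_codes = semantic_id((1.2, 1.0), CODEBOOKS)
print(item_codes)
```

In practice the codebooks would be learned from the item embeddings (for example by clustering), and items with similar embeddings would then share leading codes.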

Problem

Research questions and friction points this paper is trying to address.

Constructing Semantic IDs for joint search and recommendation tasks

Comparing task-specific vs cross-task Semantic ID strategies

Optimizing unified generative models for both search and recommendation
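
One of the questions above is whether search and recommendation should share Semantic ID tokens or each get their own. The sketch below only illustrates what that choice means for the model's vocabulary. The token format and the helper names (to_tokens, vocab_size) are hypothetical, not taken from the paper.

```python
from typing import List, Optional, Sequence


def to_tokens(codes: Sequence[int], task: Optional[str] = None) -> List[str]:
    """Turn per-level codes into vocabulary tokens.

    task=None gives tokens shared by every task; passing a task name
    gives that task its own copy of each token.
    """
    prefix = f"{task}_" if task else ""
    return [f"<{prefix}L{level}_{code}>" for level, code in enumerate(codes)]


def vocab_size(levels: int, codes_per_level: int, num_task_copies: int = 1) -> int:
    """Number of Semantic ID tokens the generative model must learn."""
    return levels * codes_per_level * num_task_copies


shared = to_tokens([1, 1])
search_only = to_tokens([1, 1], task="search")
print(shared, search_only)
```

With shared tokens, both tasks update the same token embeddings. With task-specific tokens, the vocabulary grows with the number of tasks and nothing learned for one task's tokens carries over to the other's.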

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bi-encoder model for joint task embeddings

Unified Semantic ID space construction

Cross-task fine-tuning for generalization
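
A rough sketch of what "fine-tuned on both tasks" could look like as a training objective: a weighted sum of two in-batch contrastive losses, one over query-item pairs (search) and one over history-item pairs (recommendation). The summary does not state the authors' loss, so the objective, the search_weight parameter, and the function names are assumptions. The embeddings are passed in directly, where a real bi-encoder would produce them.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Batch = Tuple[List[Vector], List[Vector]]  # (anchor embeddings, matching item embeddings)


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def contrastive_loss(anchors: List[Vector], items: List[Vector],
                     temperature: float = 1.0) -> float:
    """In-batch softmax loss: items[i] is the positive for anchors[i],
    and every other item in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [dot(anchor, item) / temperature for item in items]
        peak = max(logits)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_norm - logits[i]
    return total / len(anchors)


def joint_loss(search: Batch, rec: Batch, search_weight: float = 0.5) -> float:
    """Weighted mix of the search loss and the recommendation loss."""
    return (search_weight * contrastive_loss(*search)
            + (1.0 - search_weight) * contrastive_loss(*rec))


# Toy batches: anchors aligned with their items vs. deliberately swapped.
aligned = ([(1.0, 0.0), (0.0, 1.0)], [(1.0, 0.0), (0.0, 1.0)])
swapped = ([(1.0, 0.0), (0.0, 1.0)], [(0.0, 1.0), (1.0, 0.0)])
print(contrastive_loss(*aligned), contrastive_loss(*swapped))
```

Because the same item encoder receives gradients from both losses, the resulting item embeddings reflect both tasks before being discretized into a single Semantic ID space.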
