Deploying Semantic ID-based Generative Retrieval for Large-Scale Podcast Discovery at Spotify

📅 2026-03-18
🤖 AI Summary
Traditional recommender systems struggle to jointly model users' long-term preferences and their evolving exploration intent. This work proposes GLIDE, a framework that formulates podcast recommendation as an instruction-following generation task over Semantic IDs. GLIDE discretizes the catalog into Semantic IDs, uses a large language model (LLM) for semantic reasoning, and injects long-term user embeddings as soft prompts, which enables personalized generative retrieval under production cost and latency constraints. In online A/B tests on Spotify, GLIDE increased non-habitual podcast streaming on the home surface by up to 5.4% and new-show discovery by up to 14.3%.
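
To make the idea of discretizing a catalog into Semantic IDs more concrete, here is a minimal sketch. It assumes residual quantization, a common way to build such IDs; the summary does not say which quantizer GLIDE uses. The codebooks, the function names (`nearest`, `semantic_id`, `to_tokens`), and the `<sid_level_code>` token format are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest(codebook: Sequence[Vector], x: Vector) -> int:
    """Return the index of the codebook entry closest to x (squared Euclidean distance)."""
    def dist2(c: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(range(len(codebook)), key=lambda i: dist2(codebook[i]))


def semantic_id(embedding: Vector, codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Quantize an item embedding level by level.

    Each level encodes the residual left over by the previous level, so
    early codes are coarse and later codes refine them.
    """
    residual = list(embedding)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def to_tokens(codes: Sequence[int]) -> List[str]:
    """Render codes as vocabulary tokens an LLM could generate (hypothetical format)."""
    return [f"<sid_{level}_{code}>" for level, code in enumerate(codes)]


# Toy, hand-written codebooks (assumption: real codebooks would be learned).
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 1.0)],   # coarse level
    [(0.0, 0.0), (0.2, -0.2)],  # fine level
]

if __name__ == "__main__":
    codes = semantic_id((1.2, 0.8), CODEBOOKS)
    print(codes, to_tokens(codes))
```

Items with similar embeddings share leading codes in this scheme, which is what lets a generative model treat nearby shows as sharing a prefix.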

📝 Abstract
Podcast listening is often grounded in a set of favorite shows, while listener intent can evolve over time. This combination of stable preferences and changing intent motivates recommendation approaches that support both familiarity and exploration. Traditional recommender systems typically emphasize long-term interaction patterns, and are less explicitly designed to incorporate rich contextual signals or flexible, intent-aware discovery objectives. In this setting, models that can jointly reason over semantics, context, and user state offer a promising direction. Large Language Models (LLMs) provide strong semantic reasoning and contextual conditioning for discovery-oriented recommendation, but deploying them in production introduces challenges in catalog grounding, user-level personalization, and latency-critical serving. We address these challenges with GLIDE, a production-scale generative recommender for podcast discovery at Spotify. GLIDE formulates recommendation as an instruction-following task over a discretized catalog using Semantic IDs, enabling grounded generation over a large inventory. The model conditions on recent listening history and lightweight user context, while injecting long-term user embeddings as soft prompts to capture stable preferences under strict inference constraints. We evaluate GLIDE using offline retrieval metrics, human judgments, and LLM-based evaluation, and validate its impact through large-scale online A/B testing. Across experiments involving millions of users, GLIDE increases non-habitual podcast streaming on the Spotify home surface by up to 5.4% and new-show discovery by up to 14.3%, while meeting production cost and latency constraints.
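
The soft-prompt mechanism mentioned in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how the long-term user embedding is mapped into the model's input space, so the single linear projection, the number of soft tokens, and the names `project_user_embedding` and `build_input` are all hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def project_user_embedding(
    user_emb: Sequence[float],
    weights: Sequence[Sequence[float]],
    n_soft_tokens: int,
    d_model: int,
) -> List[Vector]:
    """Map a user embedding to n_soft_tokens vectors of size d_model.

    `weights` has shape (n_soft_tokens * d_model, d_user). A linear map is
    an assumption made for this sketch; any learned adapter could be used.
    """
    if len(weights) != n_soft_tokens * d_model:
        raise ValueError("weights must have n_soft_tokens * d_model rows")
    flat = [sum(w * u for w, u in zip(row, user_emb)) for row in weights]
    return [flat[i * d_model:(i + 1) * d_model] for i in range(n_soft_tokens)]


def build_input(soft_tokens: List[Vector], token_embeddings: List[Vector]) -> List[Vector]:
    """Prepend soft-prompt vectors to the embedded text prompt.

    The result is what would be fed to the language model in place of
    ordinary token embeddings.
    """
    return soft_tokens + token_embeddings


if __name__ == "__main__":
    user_emb = [1.0, 2.0]                      # toy long-term user embedding
    weights = [[1, 0], [0, 1], [1, 1], [0, 0]]  # toy projection matrix
    soft = project_user_embedding(user_emb, weights, n_soft_tokens=2, d_model=2)
    prompt = [[0.5, 0.5]]                       # toy embedded instruction token
    print(build_input(soft, prompt))
```

Because the user signal enters as a fixed, small number of vectors, the prompt length stays bounded regardless of how long the user's history is, which fits the strict inference constraints the abstract describes.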

Problem

Research questions and friction points this paper is trying to address.

generative retrieval
podcast discovery
semantic reasoning
user intent
large language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Retrieval
Semantic ID
Large Language Models
Personalized Recommendation
Instruction-Following

Authors

Edoardo D'Amico
Research scientist @ Spotify
Recommender systems, Graph Representation Learning

Marco De Nadai
Research Scientist, Spotify
Recommendation Systems, Representation Learning, Computer Vision, Computational Social Science, Machine Learning

Praveen Chandar
Spotify
Information Retrieval, RecSys, LLMs, Generative AI

Divita Vohra
Spotify

Shawn Lin
Spotify

Max Lefarov
Spotify

Paul Gigioli
Spotify

Gustavo Penha
Spotify Research
Recommender Systems, Information Retrieval, Natural Language Processing, Machine Learning

Ilya Kopysitsky
Spotify

Ivo Joel Senese
Spotify

Darren Mei
Spotify

Francesco Fabbri
Spotify
Recommender Systems, Machine Learning, Personalization

Oguz Semerci
Spotify

Yu Zhao
Spotify

Vincent Tang
Lawrence Livermore National Laboratory
Plasma Physics

Brian St. Thomas
Spotify

Alexandra Ranieri
Spotify

Matthew N. K. Smith
Spotify

Aaron Bernkopf
Spotify

Bryan Leung
Spotify

Ghazal Fazelnia
Spotify

Mark VanMiddlesworth
Spotify

Timothy Christopher Heath
Spotify

Petter Pehrson Skiden
Spotify

Alice Y. Wang
Spotify