🤖 AI Summary
Existing autoregressive generative recommendation models suffer from two limitations: unidirectional causal attention, which hinders global semantic modeling, and error accumulation caused by fixed left-to-right token generation. This paper proposes LLaDA-Rec, the first discrete diffusion-based recommendation framework for semantic ID generation. Built on a bidirectional Transformer, it introduces parallel tokenization, two-level masking over the user history and the next item, and a diffusion-adapted beam search, enabling non-autoregressive, high-quality sequence generation free of causal constraints. Its core contribution is integrating discrete diffusion into recommender systems, coupled with an adaptive generation order that jointly models inter-item and intra-item dependencies. Extensive experiments on three real-world datasets demonstrate significant improvements over state-of-the-art ID-based and generative baselines, validating the effectiveness of the discrete diffusion paradigm for recommendation.
📝 Abstract
Generative recommendation represents each item as a semantic ID, i.e., a sequence of discrete tokens, and generates the next item through autoregressive decoding. While effective, existing autoregressive models face two intrinsic limitations: (1) unidirectional constraints, where causal attention restricts each token to attend only to its predecessors, hindering global semantic modeling; and (2) error accumulation, where the fixed left-to-right generation order causes prediction errors in early tokens to propagate to subsequent tokens. To address these issues, we propose LLaDA-Rec, a discrete diffusion framework that reformulates recommendation as parallel semantic ID generation. By combining bidirectional attention with an adaptive generation order, the approach models inter-item and intra-item dependencies more effectively and alleviates error accumulation. Specifically, our approach comprises three key designs: (1) a parallel tokenization scheme that produces semantic IDs for bidirectional modeling, addressing the mismatch between residual quantization and bidirectional architectures; (2) two masking mechanisms, at the user-history and next-item levels, to capture both inter-item sequential dependencies and intra-item semantic relationships; and (3) an adapted beam search strategy for adaptive-order discrete diffusion decoding, resolving the incompatibility of standard beam search with diffusion-based generation. Experiments on three real-world datasets show that LLaDA-Rec consistently outperforms both ID-based and state-of-the-art generative recommenders, establishing discrete diffusion as a new paradigm for generative recommendation.
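The adaptive-order decoding idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: `toy_scores` is a hypothetical stand-in for a bidirectional Transformer that, conditioned on the user history and all currently unmasked tokens, predicts a token and a confidence for every masked position. The point is only to show how unmasking the most confident position each step yields a generation order that need not be left-to-right, which is what distinguishes discrete diffusion decoding from autoregressive decoding.

```python
# Illustrative sketch (assumed, not the paper's code): adaptive-order
# decoding for a masked discrete-diffusion semantic ID. All positions
# start masked; each step, a scorer rates every masked position and the
# most confident one is unmasked.

MASK = -1

def toy_scores(tokens):
    """Mock bidirectional scorer: map each masked position to
    (confidence, predicted_token). A real model would recompute these
    from the user history and the unmasked context; here the values are
    fixed per position purely for illustration."""
    table = {0: (0.6, 11), 1: (0.9, 22), 2: (0.7, 33), 3: (0.8, 44)}
    return {i: table[i] for i, t in enumerate(tokens) if t == MASK}

def adaptive_decode(length):
    tokens = [MASK] * length
    order = []  # positions in the order they were unmasked
    while MASK in tokens:
        scores = toy_scores(tokens)
        # unmask the highest-confidence position, wherever it sits
        pos = max(scores, key=lambda i: scores[i][0])
        tokens[pos] = scores[pos][1]
        order.append(pos)
    return tokens, order

tokens, order = adaptive_decode(4)
print(tokens)  # [11, 22, 33, 44]
print(order)   # [1, 3, 2, 0], i.e., not left-to-right
```

An early low-confidence token thus never constrains later predictions the way the first token does in left-to-right decoding; the paper's diffusion-adapted beam search extends this single-path greedy sketch by tracking multiple candidate unmasking orders in parallel.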