Align$^3$GR: Unified Multi-Level Alignment for LLM-based Generative Recommendation

📅 2025-11-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the misalignment between semantic representations and user behavioral signals when large language models (LLMs) are applied to generative recommendation, this paper proposes a framework that unifies alignment at three levels: token, behavior modeling, and preference. At the token level, it employs a dual-tokenization scheme that fuses semantic and collaborative signals; at the behavior modeling level, it introduces bidirectional semantic alignment to strengthen behavioral representation learning; and at the preference level, it designs a progressive optimization strategy that integrates self-play Direct Preference Optimization (SP-DPO) with real-feedback DPO (RF-DPO). Extensive experiments on public benchmarks demonstrate improvements of 17.8% in Recall@10 and 20.2% in NDCG@10 over strong baselines, and full-scale industrial deployment with online A/B testing confirms its practical effectiveness.
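The summary names two DPO variants, SP-DPO and RF-DPO, without showing a loss. Both presumably build on the standard Direct Preference Optimization objective (Rafailov et al., 2023); the sketch below implements that base objective in PyTorch under that assumption, with the progressive schedule noted only as a comment. The function name, the beta value, and the toy inputs are illustrative, not the paper's.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO objective over per-sequence log-probabilities.

    Implicit rewards are the log-ratios of the trained policy to a
    frozen reference model; the loss is minimized when the chosen
    (preferred) sequence out-scores the rejected one.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# A progressive schedule in the paper's spirit would first feed this loss
# self-play pairs (model generations ranked against ground truth), then
# switch to pairs derived from real user feedback.

# Toy batch of 4 preference pairs.
torch.manual_seed(0)
pc, pr = torch.randn(4), torch.randn(4)   # policy log-probs (chosen, rejected)
rc, rr = torch.randn(4), torch.randn(4)   # reference log-probs
print(dpo_loss(pc, pr, rc, rr))
```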

📝 Abstract
Large Language Models (LLMs) demonstrate significant advantages in leveraging structured world knowledge and multi-step reasoning capabilities. However, fundamental challenges arise when transforming LLMs into real-world recommender systems due to semantic and behavioral misalignment. To bridge this gap, we propose Align$^3$GR, a novel framework that unifies token-level, behavior modeling-level, and preference-level alignment. Our approach introduces: (1) dual tokenization fusing user-item semantic and collaborative signals; (2) enhanced behavior modeling with bidirectional semantic alignment; and (3) a progressive DPO strategy combining self-play (SP-DPO) and real-world feedback (RF-DPO) for dynamic preference adaptation. Experiments show Align$^3$GR outperforms the SOTA baseline by +17.8% in Recall@10 and +20.2% in NDCG@10 on the public dataset, with significant gains in online A/B tests and full-scale deployment on an industrial large-scale recommendation platform.
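The abstract's second component, bidirectional semantic alignment, is not spelled out here. A common realization is a symmetric two-direction contrastive (InfoNCE) loss between behavior and semantic embeddings of the same items; the PyTorch sketch below assumes that formulation, and the names behavior_emb, semantic_emb, and tau are hypothetical rather than the paper's.

```python
import torch
import torch.nn.functional as F

def bidirectional_alignment_loss(behavior_emb, semantic_emb, tau=0.07):
    """Symmetric InfoNCE between two views of the same items.

    Row i of behavior_emb and row i of semantic_emb describe the same
    item, so matched pairs sit on the diagonal of the similarity matrix.
    """
    b = F.normalize(behavior_emb, dim=-1)     # cosine-normalize both views
    s = F.normalize(semantic_emb, dim=-1)
    logits = (b @ s.t()) / tau                # [N, N] similarity matrix
    targets = torch.arange(b.size(0))         # diagonal = positive pairs
    loss_b2s = F.cross_entropy(logits, targets)       # behavior -> semantic
    loss_s2b = F.cross_entropy(logits.t(), targets)   # semantic -> behavior
    return 0.5 * (loss_b2s + loss_s2b)

# Toy usage: 8 items with 32-dim behavior and semantic embeddings.
torch.manual_seed(0)
print(bidirectional_alignment_loss(torch.randn(8, 32), torch.randn(8, 32)))
```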
Problem

Research questions and friction points this paper is trying to address.

Bridging semantic and behavioral misalignment in LLM-based recommender systems
Unifying multi-level alignment through token, behavior, and preference integration
Enhancing recommendation accuracy via dynamic preference adaptation strategies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual tokenization fusing semantic and collaborative signals (see the sketch after this list)
Enhanced behavior modeling with bidirectional semantic alignment
Progressive DPO strategy combining self-play and real-world feedback
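As a concrete illustration of the first item, dual tokenization: one plausible reading is that each item is represented by semantic-ID tokens (e.g., codebook indices from quantizing a content embedding) plus a collaborative-ID token derived from behavioral signals, concatenated into the token sequence the LLM generates. The sketch below assumes that scheme; the token format, the function name dual_tokenize, and the code values are all hypothetical.

```python
def dual_tokenize(semantic_codes, collab_code):
    """Fuse an item's semantic and collaborative IDs into one token sequence.

    semantic_codes: per-level codebook indices, e.g. [12, 7, 3] from a
    residual-quantized content embedding; collab_code: a single index
    from clustering collaborative-filtering embeddings.
    """
    sem_tokens = [f"<sem_{level}_{c}>" for level, c in enumerate(semantic_codes)]
    cf_token = f"<cf_{collab_code}>"
    return sem_tokens + [cf_token]

print(dual_tokenize([12, 7, 3], 5))
# ['<sem_0_12>', '<sem_1_7>', '<sem_2_3>', '<cf_5>']
```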
👥 Authors
Wencai Ye (Kuaishou Technology, China)
Mingjie Sun (Thinking Machines Lab)
Shuhang Chen (Zhejiang University)
Wenjin Wu (Kuaishou Technology, China)
Peng Jiang (Kuaishou Technology, China)