Align$^3$GR: Unified Multi-Level Alignment for LLM-based Generative Recommendation

📅 2025-11-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the misalignment between semantic representations and user behavioral signals when large language models (LLMs) are applied to generative recommendation, this paper proposes a framework that unifies alignment at three levels: token, behavior modeling, and preference. At the token level, it employs a dual-tokenization scheme that fuses semantic and collaborative signals; at the behavior modeling level, it introduces bidirectional semantic alignment to strengthen behavioral representation learning; and at the preference level, it designs a progressive optimization strategy that integrates self-play Direct Preference Optimization (SP-DPO) with real-feedback DPO (RF-DPO). Extensive experiments on public benchmarks demonstrate improvements of 17.8% in Recall@10 and 20.2% in NDCG@10 over strong baselines, and full-scale industrial deployment with online A/B testing confirms its practical effectiveness.
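The summary names two DPO variants, SP-DPO and RF-DPO, without showing a loss. Both presumably build on the standard Direct Preference Optimization objective (Rafailov et al., 2023); the sketch below implements that base objective in PyTorch under that assumption, with the progressive schedule noted only as a comment. The function name, the beta value, and the toy inputs are illustrative, not the paper's.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO objective over per-sequence log-probabilities.

    Implicit rewards are the log-ratios of the trained policy to a
    frozen reference model; the loss is minimized when the chosen
    (preferred) sequence out-scores the rejected one.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# A progressive schedule in the paper's spirit would first feed this loss
# self-play pairs (model generations ranked against ground truth), then
# switch to pairs derived from real user feedback.

# Toy batch of 4 preference pairs.
torch.manual_seed(0)
pc, pr = torch.randn(4), torch.randn(4)   # policy log-probs (chosen, rejected)
rc, rr = torch.randn(4), torch.randn(4)   # reference log-probs
print(dpo_loss(pc, pr, rc, rr))
```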

📝 Abstract
Large Language Models (LLMs) demonstrate significant advantages in leveraging structured world knowledge and multi-step reasoning capabilities. However, fundamental challenges arise when transforming LLMs into real-world recommender systems due to semantic and behavioral misalignment. To bridge this gap, we propose Align$^3$GR, a novel framework that unifies token-level, behavior modeling-level, and preference-level alignment. Our approach introduces: (1) dual tokenization fusing user-item semantic and collaborative signals; (2) enhanced behavior modeling with bidirectional semantic alignment; and (3) a progressive DPO strategy combining self-play (SP-DPO) and real-world feedback (RF-DPO) for dynamic preference adaptation. Experiments show Align$^3$GR outperforms the SOTA baseline by +17.8% in Recall@10 and +20.2% in NDCG@10 on the public dataset, with significant gains in online A/B tests and full-scale deployment on an industrial large-scale recommendation platform.
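The abstract's second component, bidirectional semantic alignment, is not spelled out here. A common realization is a symmetric two-direction contrastive (InfoNCE) loss between behavior and semantic embeddings of the same items; the PyTorch sketch below assumes that formulation, and the names behavior_emb, semantic_emb, and tau are hypothetical rather than the paper's.

```python
import torch
import torch.nn.functional as F

def bidirectional_alignment_loss(behavior_emb, semantic_emb, tau=0.07):
    """Symmetric InfoNCE between two views of the same items.

    Row i of behavior_emb and row i of semantic_emb describe the same
    item, so matched pairs sit on the diagonal of the similarity matrix.
    """
    b = F.normalize(behavior_emb, dim=-1)     # cosine-normalize both views
    s = F.normalize(semantic_emb, dim=-1)
    logits = (b @ s.t()) / tau                # [N, N] similarity matrix
    targets = torch.arange(b.size(0))         # diagonal = positive pairs
    loss_b2s = F.cross_entropy(logits, targets)       # behavior -> semantic
    loss_s2b = F.cross_entropy(logits.t(), targets)   # semantic -> behavior
    return 0.5 * (loss_b2s + loss_s2b)

# Toy usage: 8 items with 32-dim behavior and semantic embeddings.
torch.manual_seed(0)
print(bidirectional_alignment_loss(torch.randn(8, 32), torch.randn(8, 32)))
```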
Problem

Research questions and friction points this paper is trying to address.

Bridging semantic and behavioral misalignment in LLM-based recommender systems
Unifying multi-level alignment through token, behavior, and preference integration
Enhancing recommendation accuracy via dynamic preference adaptation strategies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual tokenization fusing semantic and collaborative signals (see the sketch after this list)
Enhanced behavior modeling with bidirectional semantic alignment
Progressive DPO strategy combining self-play and real-world feedback
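As a concrete illustration of the first item, dual tokenization: one plausible reading is that each item is represented by semantic-ID tokens (e.g., codebook indices from quantizing a content embedding) plus a collaborative-ID token derived from behavioral signals, concatenated into the token sequence the LLM generates. The sketch below assumes that scheme; the token format, the function name dual_tokenize, and the code values are all hypothetical.

```python
def dual_tokenize(semantic_codes, collab_code):
    """Fuse an item's semantic and collaborative IDs into one token sequence.

    semantic_codes: per-level codebook indices, e.g. [12, 7, 3] from a
    residual-quantized content embedding; collab_code: a single index
    from clustering collaborative-filtering embeddings.
    """
    sem_tokens = [f"<sem_{level}_{c}>" for level, c in enumerate(semantic_codes)]
    cf_token = f"<cf_{collab_code}>"
    return sem_tokens + [cf_token]

print(dual_tokenize([12, 7, 3], 5))
# ['<sem_0_12>', '<sem_1_7>', '<sem_2_3>', '<cf_5>']
```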
👥 Authors
Wencai Ye (Kuaishou Technology, China)
Mingjie Sun (Thinking Machines Lab)
Shuhang Chen (Zhejiang University)
Wenjin Wu (Kuaishou Technology, China)
Peng Jiang (Kuaishou Technology, China)