🤖 AI Summary
Traditional deep learning-based recommendation models face limitations in performance, efficiency, generalization, and long-sequence modeling. This work proposes GRAB, an end-to-end generative CTR prediction framework inspired by large language models and grounded in a "sequence-first" paradigm for user behavior modeling. Its core innovation is the Causal Action-aware Multi-channel Attention (CamA) mechanism, which captures temporal dynamics and action signals within user behavior sequences while enabling efficient scaling. In full-scale online deployment, GRAB achieved a 3.49% increase in CTR and a 3.05% boost in revenue. Moreover, the model's representational capacity scales nearly linearly with sequence length, highlighting its strong adaptability to extended user histories.
📝 Abstract
Traditional Deep Learning Recommendation Models (DLRMs) face increasing bottlenecks in performance and efficiency, often struggling with generalization and long-sequence modeling. Inspired by the scaling success of Large Language Models (LLMs), we propose Generative Ranking for Ads at Baidu (GRAB), an end-to-end generative framework for Click-Through Rate (CTR) prediction. GRAB integrates a novel Causal Action-aware Multi-channel Attention (CamA) mechanism to effectively capture temporal dynamics and specific action signals within user behavior sequences. Full-scale online deployment demonstrates that GRAB significantly outperforms established DLRMs, delivering a 3.05% increase in revenue and a 3.49% rise in CTR. Furthermore, the model demonstrates desirable scaling behavior: its expressive power shows a monotonic and approximately linear improvement as longer interaction sequences are utilized.
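The abstract does not spell out how CamA is wired internally, but its name suggests combining a causal (autoregressive) attention mask with per-action-type attention channels over the behavior sequence. The sketch below is a minimal, illustrative interpretation of that idea, not the paper's actual implementation: all weight matrices, the channel-per-action-type routing, and the two hypothetical action types (`0` = impression, `1` = click) are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_multichannel_attention(x, action_ids, num_channels, rng):
    """Illustrative sketch of a causal, action-aware multi-channel attention:
    one attention channel per action type, with a lower-triangular (causal)
    mask so each position attends only to earlier behaviors. Projection
    weights are random placeholders standing in for learned parameters."""
    T, d = x.shape
    causal_mask = np.tril(np.ones((T, T), dtype=bool))
    out = np.zeros_like(x)
    for c in range(num_channels):
        # Per-channel query/key/value projections (randomly initialized here).
        Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = (q @ k.T) / np.sqrt(d)
        # Channel c only attends to past behaviors of action type c.
        allowed = causal_mask & (action_ids == c)[None, :]
        scores = np.where(allowed, scores, -1e9)
        attn = softmax(scores, axis=-1)
        # Rows with no allowed key would get a spurious uniform distribution;
        # zero their contribution instead.
        attn = np.where(allowed.any(axis=-1, keepdims=True), attn, 0.0)
        out += attn @ v
    return out

rng = np.random.default_rng(0)
T, d = 6, 8
x = rng.standard_normal((T, d))            # toy behavior-sequence embeddings
actions = np.array([0, 1, 0, 1, 1, 0])      # hypothetical action types
y = causal_multichannel_attention(x, actions, num_channels=2, rng=rng)
print(y.shape)  # (6, 8)
```

The causal mask is what makes the framing generative: changing a later behavior in the sequence cannot affect the representation of any earlier position, which mirrors the next-token-style factorization used by LLMs.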