CTR-Driven Ad Text Generation via Online Feedback Preference Optimization

📅 2025-07-27
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing LLM-generated ad copy often underperforms human-written copy in click-through rate (CTR), revealing a critical gap between generative quality and online effectiveness. To bridge this gap, we propose an end-to-end CTR-driven framework for automated ad copy generation. In the first stage, we employ retrieval-augmented generation (RAG) combined with chain-of-thought (CoT) exemplars to enable diverse, high-quality copy sampling. In the second stage, we dynamically optimize the generation policy using online CTR gains and confidence-weighted preference signals derived from real-time user feedback. By unifying RAG, in-context learning, and online preference optimization, our approach sidesteps biases inherent in offline evaluation. Empirical evaluation on a large-scale e-commerce platform demonstrates significant improvements in both offline diversity and relevance metrics, as well as a +12.7% lift in online CTR, shifting LLM-based ad generation from text-quality-centric design to closed-loop, conversion-optimized deployment.

πŸ“ Abstract
Advertising text plays a critical role in determining click-through rates (CTR) in online advertising. Large Language Models (LLMs) offer significant efficiency advantages over manual ad text creation. However, LLM-generated ad texts do not guarantee higher CTR performance than human-crafted texts, revealing a gap between generation quality and online performance of ad texts. In this work, we propose a novel ad text generation method that optimizes for CTR via preference optimization from online feedback. Our approach adopts an innovative two-stage framework: (1) diverse ad text sampling via one-shot in-context learning, using retrieval-augmented generation (RAG) to provide exemplars with chain-of-thought (CoT) reasoning; (2) CTR-driven preference optimization from online feedback, which weighs preference pairs according to their CTR gains and confidence levels. The resulting model enables end-to-end generation of high-CTR ad texts. Extensive experiments demonstrate the effectiveness of our method on both offline and online metrics. Notably, we have applied our method on a large-scale online shopping platform and achieved significant CTR improvements, showcasing its strong applicability and effectiveness in advertising systems.
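The abstract's second stage weighs each preference pair by its CTR gain and a confidence level. The paper's exact formulation is not given here, so the sketch below is only an illustration of the idea: it scores a pair (text A preferred over text B) by the observed CTR difference, scaled by a confidence term from a two-proportion z-test (an assumption on my part; the authors may use a different confidence estimate). The function name `pair_weight` and its interface are hypothetical.

```python
import math

def pair_weight(clicks_a, impr_a, clicks_b, impr_b):
    """Weight a preference pair (A preferred over B) by CTR gain times
    statistical confidence. Illustrative sketch only; the paper's exact
    weighting scheme is not specified here."""
    ctr_a = clicks_a / impr_a
    ctr_b = clicks_b / impr_b
    gain = ctr_a - ctr_b
    if gain <= 0:
        return 0.0  # A is not actually preferred over B
    # Two-proportion z-test as a stand-in confidence measure (assumption).
    p = (clicks_a + clicks_b) / (impr_a + impr_b)
    se = math.sqrt(p * (1 - p) * (1 / impr_a + 1 / impr_b))
    if se == 0:
        return 0.0
    z = gain / se
    # One-sided confidence that A's CTR truly exceeds B's (normal CDF).
    conf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return gain * conf
```

Pairs with a large, statistically solid CTR gap would then contribute more to the preference-optimization loss than noisy pairs backed by few impressions.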
Problem

Research questions and friction points this paper is trying to address.

Optimizing ad text generation for higher click-through rates (CTR)
Bridging quality-performance gap in LLM-generated ad texts
Enhancing ad relevance via online feedback preference optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

CTR-driven preference optimization from online feedback
Retrieval-augmented generation with chain-of-thought reasoning
Two-stage framework for diverse ad text sampling
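The first stage retrieves a CoT exemplar to build a one-shot prompt. As a rough sketch of that retrieval-plus-prompting step: the retriever below uses simple token-overlap (Jaccard) similarity as a stand-in for whatever retriever the authors use, and the prompt template, field names (`product`, `reasoning`, `ad_text`), and helper names are all my own assumptions, not the paper's.

```python
def retrieve_exemplar(product_desc, exemplar_bank):
    """Pick the most similar exemplar by token-overlap (Jaccard) similarity.
    A stand-in for the paper's retriever, whose details are not given here."""
    query = set(product_desc.lower().split())

    def jaccard(ex):
        toks = set(ex["product"].lower().split())
        union = query | toks
        return len(query & toks) / len(union) if union else 0.0

    return max(exemplar_bank, key=jaccard)

def build_one_shot_prompt(product_desc, exemplar):
    """Assemble a one-shot in-context prompt with CoT reasoning (hypothetical
    template; the paper's actual prompt format is not reproduced here)."""
    return (
        f"Product: {exemplar['product']}\n"
        f"Reasoning: {exemplar['reasoning']}\n"
        f"Ad text: {exemplar['ad_text']}\n\n"
        f"Product: {product_desc}\n"
        f"Reasoning:"
    )
```

Sampling diverse candidates would then amount to calling the LLM several times on such a prompt at a nonzero temperature.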
Yanda Chen
Anthropic
Natural Language Processing, Machine Learning
Zihui Ren
Taobao & Tmall Group of Alibaba
Qixiang Gao
Taobao & Tmall Group of Alibaba
Jiale Chen
Taobao & Tmall Group of Alibaba
Si Chen
Taobao & Tmall Group of Alibaba
Xubin Li
Taobao & Tmall Group of Alibaba
Tiezheng Ge
Senior staff algorithm engineer, Alimama, Alibaba Group
Computer Vision, AIGC, Recommender Systems
Bo Zheng
Taobao & Tmall Group of Alibaba