🤖 AI Summary
This work proposes AdNanny, a unified reasoning-centric large language model designed as a shared backbone for multiple tasks in offline advertising recommendation systems (such as query-ad relevance labeling, keyword generation, and user profiling), addressing the redundancy and high maintenance costs of training a separate large model per task. AdNanny combines multi-task supervised fine-tuning with an adaptive reweighting mechanism and reasoning-augmented data construction, and is further aligned with online objectives through reinforcement learning driven by downstream advertising metrics. Evaluated in the Bing Ads production environment, AdNanny significantly improves accuracy across multiple tasks, reduces reliance on costly human annotation, and replaces several specialized models, enabling a more efficient and scalable advertising system architecture.
📝 Abstract
Large Language Models (LLMs) have shown strong capabilities in Natural Language Understanding and Generation, but deploying them directly in online advertising systems is often impractical due to strict millisecond-level latency constraints. This has motivated the use of LLMs offline to improve retrieval, ranking, and recommendation models. Existing solutions typically fine-tune separate LLMs for individual tasks such as query-ad relevance labeling, keyword-based query generation, and user profiling. This results in redundant models, high maintenance cost, and limited performance gains despite substantial overlap in domain knowledge and reasoning patterns. We introduce AdNanny, a unified reasoning-centric LLM that serves as a shared backbone for offline advertising tasks. AdNanny is obtained by fine-tuning a public 671B-parameter DeepSeek-R1 checkpoint using a scalable training system that supports hybrid dense-MoE parallelism. We construct reasoning-augmented corpora that pair structured supervision with step-by-step natural language explanations. A multi-task supervised fine-tuning stage with adaptive reweighting enables AdNanny to handle diverse labeling and generation tasks in a consistent reasoning format. This is followed by reinforcement learning using downstream advertising metrics to align model behavior with online retrieval and ranking objectives. AdNanny is deployed in production within Bing Ads, where it significantly reduces manual labeling effort and improves accuracy across multiple offline tasks. By consolidating many task-specific models into a single reasoning-centric foundation model, AdNanny provides a scalable and cost-effective solution for large-scale advertising systems.
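The abstract describes a multi-task SFT stage with adaptive reweighting across tasks such as relevance labeling, keyword generation, and user profiling. The paper does not specify the mechanism, so the sketch below is a minimal, hypothetical illustration of one common approach: maintaining an exponential moving average of each task's loss and giving proportionally more weight to tasks whose smoothed loss remains high. All names (`AdaptiveReweighter`, the task labels) are illustrative assumptions, not from the paper.

```python
# Hypothetical sketch of adaptive multi-task loss reweighting (not the
# paper's actual method): tasks with a higher smoothed loss get more weight.

class AdaptiveReweighter:
    """Derive per-task weights from an exponential moving average of losses."""

    def __init__(self, tasks, beta=0.9):
        self.beta = beta                     # EMA smoothing factor
        self.ema = {t: None for t in tasks}  # per-task loss EMA

    def update(self, losses):
        """losses: dict mapping task -> current batch loss.
        Returns weights normalized to sum to 1."""
        for t, loss in losses.items():
            prev = self.ema[t]
            self.ema[t] = loss if prev is None else (
                self.beta * prev + (1 - self.beta) * loss
            )
        total = sum(self.ema.values())
        # Harder tasks (higher smoothed loss) receive proportionally more weight.
        return {t: self.ema[t] / total for t in self.ema}

    def combined_loss(self, losses):
        """Weighted sum used as the single training objective for the step."""
        weights = self.update(losses)
        return sum(weights[t] * losses[t] for t in losses)


# Illustrative task names matching the offline tasks mentioned in the abstract.
rw = AdaptiveReweighter(["relevance", "keyword_gen", "user_profile"])
step_losses = {"relevance": 0.8, "keyword_gen": 1.6, "user_profile": 0.6}
loss = rw.combined_loss(step_losses)  # scalar used for the backward pass
```

In a real training loop the per-task losses would come from the shared backbone's heads on mixed batches, and the combined scalar would drive a single backward pass; the reweighting here simply keeps no task from dominating as others converge.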