Ad Insertion in LLM-Generated Responses

📅 2026-01-27
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses the challenges of integrating search advertisements into large language model (LLM) responses, where traditional search advertising struggles to capture transient, context-dependent user intent in conversational settings. The authors propose a decoupled architecture that separates ad insertion from response generation and replaces raw user queries with semantic “genres” for ad auctions, substantially reducing privacy risks and computational overhead. By incorporating a Vickrey–Clarke–Groves (VCG) auction mechanism, the framework achieves, for the first time, approximate incentive compatibility, individual rationality, and social welfare optimality in LLM-based ad placement. Experimental results demonstrate that the proposed LLM-as-a-Judge evaluation metric correlates strongly with human judgments (Spearman’s ρ ≈ 0.66), outperforming 80% of human evaluators while maintaining high efficiency and compliance with multidimensional constraints.
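The summary reports agreement between the LLM-as-a-Judge metric and human ratings via Spearman's rank correlation. As a minimal illustration of how that statistic is computed (the score lists below are made-up toy data, not the paper's results), Spearman's ρ is simply Pearson correlation applied to the ranks of the two score lists:

```python
# Spearman's rank correlation from scratch: rank both score lists
# (averaging ranks across ties), then take the Pearson correlation
# of the ranks. Toy data only; not the paper's evaluation set.

def ranks(xs):
    """1-based ranks of xs, with tied values sharing their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1  # extend over a run of tied values
        avg = (i + j) / 2 + 1  # average 1-based rank across the tie
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(a, b):
    """Spearman's rho = Pearson correlation of the rank vectors."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    sa = sum((x - ma) ** 2 for x in ra) ** 0.5
    sb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (sa * sb)

judge = [4, 2, 5, 3, 1]   # hypothetical LLM-judge coherence scores
human = [5, 1, 4, 3, 2]   # hypothetical human coherence ratings
rho = spearman(judge, human)  # 0.8 for this toy data
```

A ρ near 0.66, as reported, indicates the judge's ordering of ad placements agrees substantially (though not perfectly) with the human ordering.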

📝 Abstract
Sustainable monetization of Large Language Models (LLMs) remains a critical open challenge. Traditional search advertising, which relies on static keywords, fails to capture the fleeting, context-dependent user intents (the specific information, goods, or services a user seeks) embedded in conversational flows. Beyond the standard goal of social welfare maximization, effective LLM advertising imposes additional requirements on contextual coherence (ensuring ads align semantically with transient user intents) and computational efficiency (avoiding user interaction latency), as well as adherence to ethical and regulatory standards, including preserving privacy and ensuring explicit ad disclosure. Although various recent solutions have explored token-level and query-level bidding, both categories of approaches generally fail to holistically satisfy this multifaceted set of constraints. We propose a practical framework that resolves these tensions through two decoupling strategies. First, we decouple ad insertion from response generation to ensure safety and explicit disclosure. Second, we decouple bidding from specific user queries by using ``genres'' (high-level semantic clusters) as a proxy. This allows advertisers to bid on stable categories rather than sensitive real-time responses, reducing computational burden and privacy risks. We demonstrate that applying the VCG auction mechanism to this genre-based framework yields approximately dominant strategy incentive compatibility (DSIC) and individual rationality (IR), as well as approximately optimal social welfare, while maintaining high computational efficiency. Finally, we introduce an "LLM-as-a-Judge" metric to estimate contextual coherence. Our experiments show that this metric correlates strongly with human ratings (Spearman's $\rho\approx 0.66$), outperforming 80% of individual human evaluators.
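The abstract's genre-based VCG auction can be sketched in miniature. Assuming a single ad slot per response and one sealed bid per advertiser for the matched genre (an illustrative setup, not the paper's actual interface), the VCG payment reduces to the second-price rule: the winner pays the externality it imposes, i.e. the highest losing bid.

```python
# Minimal sketch of a genre-based VCG auction for one ad slot.
# Advertiser names and bid values are illustrative assumptions.

def vcg_single_slot(bids):
    """bids: dict mapping advertiser -> bid for the matched genre.

    Returns (winner, payment). With a single slot, the VCG payment
    is the welfare the other bidders lose because the winner exists,
    which is exactly the second-highest bid (0 if unopposed).
    """
    if not bids:
        return None, 0.0
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    payment = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, payment

# A user query is first mapped to a genre; the stored genre bids are
# then auctioned, so no raw query text reaches advertisers.
genre_bids = {"A": 3.0, "B": 5.0, "C": 2.0}
winner, payment = vcg_single_slot(genre_bids)  # B wins, pays 3.0
```

Because payments depend only on others' bids, truthful bidding is a dominant strategy here; the paper's claim is that this property survives approximately when exact queries are replaced by genre proxies.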
Problem

Research questions and friction points this paper is trying to address.

Ad Insertion
Large Language Models
Contextual Coherence
Computational Efficiency
Privacy Preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ad Insertion
Genre-Based Bidding
VCG Auction
Contextual Coherence
LLM-as-a-Judge