🤖 AI Summary
Target-driven persuasive dialogue systems, such as telemarketing agents, suffer from brittle multi-turn strategy planning and frequent factual hallucinations. To address these issues, this paper proposes AI-Salesman, a two-stage framework: (1) a Bayesian-supervised reinforcement learning algorithm that learns robust sales policies from noisy real-world dialogue data; and (2) a Dynamic Outline-Guided Agent (DOGA) that provides structured strategic guidance during inference while drawing on a pre-constructed, fact-checked script repository to ensure factual consistency. The paper also introduces TeleSalesCorpus, the first high-quality, real-world telemarketing dialogue dataset, and designs an evaluation framework that combines the LLM-as-a-Judge paradigm with a fine-grained assessment of sales competencies. Experiments show that AI-Salesman significantly outperforms existing baselines in both automated metrics and human evaluations, substantially improving strategic robustness and factual accuracy.
📝 Abstract
Goal-driven persuasive dialogue, exemplified by applications like telemarketing, requires sophisticated multi-turn planning and strict factual faithfulness, both of which remain significant challenges even for state-of-the-art Large Language Models (LLMs). Previous work is often limited by a lack of task-specific data, and applying LLMs directly suffers from strategic brittleness and factual hallucination. In this paper, we first construct and release TeleSalesCorpus, the first real-world-grounded dialogue dataset for this domain. We then propose AI-Salesman, a novel framework with a dual-stage architecture. For the training stage, we design a Bayesian-supervised reinforcement learning algorithm that learns robust sales strategies from noisy dialogues. For the inference stage, we introduce the Dynamic Outline-Guided Agent (DOGA), which leverages a pre-built script library to provide dynamic, turn-by-turn strategic guidance. Moreover, we design a comprehensive evaluation framework that combines fine-grained metrics for key sales skills with the LLM-as-a-Judge paradigm. Experimental results demonstrate that AI-Salesman significantly outperforms baseline models in both automatic metrics and comprehensive human evaluations, showing its effectiveness in complex persuasive scenarios.