AI Summary
This work addresses the challenge of modeling AI persuasion in realistic scenarios involving a mix of verifiable and unverifiable information. The authors propose MixTalk, a game-theoretic framework that formulates strategic communication by integrating both types of information: a sender LLM strategically combines statements to convey private information, while a receiver LLM infers the true state under a limited verification budget. To derive robust strategies, they introduce Tournament Oracle Policy Distillation (TOPD), which distills effective verification and inference policies from multi-agent interaction logs. Large-scale tournament experiments reveal significant deficiencies in current LLMs' ability to perform credibility-aware reasoning, whereas TOPD substantially enhances the receiver's robustness against persuasive manipulation.
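The TOPD idea sketched above, distilling a strong policy from interaction logs and deploying it in-context, can be illustrated with a minimal toy. All field names (`receiver_reward`, `claims`, `verified`, `inference`) and the selection rule are our assumptions for illustration, not the paper's exact method:

```python
# Hypothetical sketch of Tournament Oracle Policy Distillation (TOPD):
# keep the highest-reward receiver episodes from tournament logs and
# reuse them as few-shot exemplars at inference time.

def distill_oracle_policy(logs, top_k=3):
    """Select the top_k episodes by receiver reward and format them
    as in-context exemplars (an assumed distillation rule)."""
    best = sorted(logs, key=lambda ep: ep["receiver_reward"], reverse=True)[:top_k]
    exemplars = []
    for ep in best:
        exemplars.append(
            f"Claims: {ep['claims']}\n"
            f"Verified: {ep['verified']}\n"
            f"Inference: {ep['inference']}"
        )
    return "\n\n".join(exemplars)

def build_receiver_prompt(oracle_exemplars, new_claims):
    """Deploy the distilled policy in-context: prepend the exemplars
    to the receiver's prompt for a new round."""
    return (
        "You are the receiver. Follow the verification and inference "
        "strategy shown in these past episodes:\n\n"
        f"{oracle_exemplars}\n\nNew claims: {new_claims}\nYour move:"
    )

logs = [
    {"receiver_reward": 1.0, "claims": "battery lasts 10h (verifiable)",
     "verified": "battery claim: true", "inference": "state = reliable"},
    {"receiver_reward": 0.1, "claims": "best product ever (unverifiable)",
     "verified": "nothing verified", "inference": "state = reliable"},
]
prompt = build_receiver_prompt(distill_oracle_policy(logs, top_k=1),
                               "waterproof to 50m (verifiable)")
```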
Abstract
Agents powered by large language models (LLMs) are increasingly deployed in settings where communication shapes high-stakes decisions, making a principled understanding of strategic communication essential. Prior work largely studies either unverifiable cheap talk or fully verifiable disclosure, failing to capture realistic domains in which information has probabilistic credibility. We introduce MixTalk, a strategic communication game for LLM-to-LLM interaction that models information credibility. In MixTalk, a sender agent strategically combines verifiable and unverifiable claims to communicate private information, while a receiver agent allocates a limited budget to costly verification and infers the underlying state from prior beliefs, claims, and verification outcomes. We evaluate state-of-the-art LLM agents in large-scale tournaments across three realistic deployment settings, revealing their strengths and limitations in reasoning about information credibility and the explicit behaviors that shape these interactions. Finally, we propose Tournament Oracle Policy Distillation (TOPD), an offline method that distills a tournament oracle policy from interaction logs and deploys it in-context at inference time. Our results show that TOPD significantly improves receiver robustness to persuasion.
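One round of the game described above can be sketched as a toy simulation. The binary state, the always-assert-positive sender, the likelihood ratios, and the budgeted verification rule below are illustrative assumptions, not MixTalk's exact formulation:

```python
import random

# Hypothetical sketch of one MixTalk-style round: a sender who knows the
# true state (0 or 1) emits claims, each a (asserted_state, verifiable)
# pair; the receiver verifies up to `budget` verifiable claims and forms
# a posterior over the state.

def sender_claims(state, n_claims, p_verifiable, rng):
    """A maximally persuasive sender: always asserts state 1, mixing
    verifiable claims (checkable against the truth) with cheap talk."""
    return [(1, rng.random() < p_verifiable) for _ in range(n_claims)]

def receiver_infer(state, claims, budget, prior=0.5):
    """Verify up to `budget` verifiable claims, then update beliefs
    with an assumed likelihood ratio per verification outcome;
    unverified claims carry only weak (cheap-talk) evidence."""
    verified = [c for c in claims if c[1]][:budget]
    for asserted, _ in verified:
        correct = (asserted == state)          # verification reveals truth
        lr = 4.0 if correct else 0.25          # assumed likelihood ratio
        odds = prior / (1 - prior) * lr
        prior = odds / (1 + odds)
    unchecked = len(claims) - len(verified)
    odds = prior / (1 - prior) * (1.1 ** unchecked)  # weak cheap-talk tilt
    posterior = odds / (1 + odds)
    return (1 if posterior > 0.5 else 0), posterior

rng = random.Random(0)
claims = sender_claims(state=0, n_claims=3, p_verifiable=0.5, rng=rng)
decision, posterior = receiver_infer(0, claims, budget=2)
```

Under these assumptions, verifying even one claim against a lying sender pushes the posterior toward the true state, which is the intuition behind the receiver's budget allocation problem.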