🤖 AI Summary
This work addresses the underutilization of cross-task structural similarity in repeated Bayesian persuasion by proposing a unified algorithmic framework that combines meta-learning with online learning to optimize signaling strategies under both full-feedback and bandit-feedback regimes. The approach provides the first theoretical guarantees for repeated persuasion tasks, achieving sharper regret bounds than existing methods under a task-similarity assumption while recovering the standard single-game guarantees when the task sequence is arbitrary. The theoretical analysis establishes improved regret rates, and numerical experiments corroborate the practical benefits of meta-learning for persuasion efficiency.
📝 Abstract
Classical Bayesian persuasion studies how a sender influences receivers through carefully designed signaling policies within a single strategic interaction. In many real-world environments, such interactions are repeated across multiple games, creating opportunities to exploit structural similarity across tasks. In this work, we introduce Meta-Persuasion algorithms, establishing the first line of theoretical results for both full-feedback and bandit-feedback settings in the Online Bayesian Persuasion (OBP) and Markov Persuasion Process (MPP) frameworks. We show that our meta-persuasion algorithms achieve provably sharper regret rates under natural notions of task similarity, improving upon the best-known convergence rates for both OBP and MPP. At the same time, they recover the standard single-game guarantees when the sequence of games is chosen arbitrarily. Finally, we complement our theoretical analysis with numerical experiments that highlight our regret improvements and the benefits of meta-learning in repeated persuasion environments.