🤖 AI Summary
This study addresses the challenge of telephone fraud prevention by proposing an adversarial dialogue framework powered by large language models (LLMs), enabling “counter-phishing” through realistic scam scenario simulation. Methodologically, it introduces a novel chain-of-thought–driven strategy emergence mechanism that requires no explicit optimization, implemented via a two-tiered prompting architecture ensuring both demographic authenticity and strategic coherence of agent roles. Experiments using GPT-4 and DeepSeek across 3,200 adversarial dialogues successfully replicate human anti-fraud behavioral patterns. Evaluation across three dimensions (cognitive plausibility, quantitative efficacy, and content specificity) demonstrates the framework’s effectiveness in actively delaying and disrupting fraudulent processes. Results show GPT-4 achieves superior naturalness and persona fidelity, while DeepSeek excels in interaction longevity. The framework exhibits strong scalability and operational viability for real-world anti-fraud deployment.
📝 Abstract
We present "Bot Wars," a framework that uses Large Language Model (LLM) scam-baiters to counter phone scams through simulated adversarial dialogues. Our key contribution is a formal foundation for strategy emergence through chain-of-thought reasoning without explicit optimization. Through a novel two-layer prompt architecture, our framework enables LLMs to craft demographically authentic victim personas while maintaining strategic coherence. We evaluate our approach using a dataset of 3,200 scam dialogues validated against 179 hours of human scam-baiting interactions, demonstrating its effectiveness in capturing complex adversarial dynamics. Our systematic evaluation through cognitive, quantitative, and content-specific metrics shows that GPT-4 excels in dialogue naturalness and persona authenticity, while DeepSeek demonstrates superior engagement sustainability.