Send to which account? Evaluation of an LLM-based Scambaiting System

📅 2025-09-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Traditional detection and response mechanisms are ineffective against large-scale, generative-AI-driven phishing scams because they fail to disrupt the underlying financial infrastructure that enables fraud, such as mule accounts and cryptocurrency wallets. Method: We present the first large-scale, real-world evaluation of an LLM-powered conversational honeypot system, which engages scammers in human-like dialogues to proactively elicit sensitive financial information. Our approach introduces a rule-augmented dialogue management framework and formally defines two core evaluation metrics: *Information Disclosure Rate* (IDR) and *Human Acceptance Rate* (HAR). Contribution/Results: Over a five-month field deployment, the system initiated over 2,600 engagements and collected more than 18,700 messages, achieving an IDR of approximately 32% and an HAR of around 70%. It extracted sensitive financial information such as mule accounts, supporting the feasibility of proactive anti-fraud approaches, although only 48.7% of scammers responded to the initial seed message.

📝 Abstract

Scammers are increasingly harnessing generative AI (GenAI) technologies to produce convincing phishing content at scale, amplifying financial fraud and undermining public trust. While conventional defenses, such as detection algorithms, user training, and reactive takedown efforts, remain important, they often fall short in dismantling the infrastructure scammers depend on, including mule bank accounts and cryptocurrency wallets. To bridge this gap, a proactive and emerging strategy involves using conversational honeypots to engage scammers and extract actionable threat intelligence. This paper presents the first large-scale, real-world evaluation of a scambaiting system powered by large language models (LLMs). Over a five-month deployment, the system initiated over 2,600 engagements with actual scammers, resulting in a dataset of more than 18,700 messages. It achieved an Information Disclosure Rate (IDR) of approximately 32%, successfully extracting sensitive financial information such as mule accounts. Additionally, the system maintained a Human Acceptance Rate (HAR) of around 70%, indicating strong alignment between LLM-generated responses and human operator preferences. Alongside these successes, our analysis reveals key operational challenges. In particular, the system struggled with engagement takeoff: only 48.7% of scammers responded to the initial seed message sent by defenders. These findings highlight the need for further refinement and provide actionable insights for advancing the design of automated scambaiting systems.
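
To make the three reported rates concrete, here is a minimal Python sketch of how such metrics could be computed from engagement logs. The abstract does not give formal definitions, so the denominators used here are assumptions: IDR is taken as the share of initiated engagements in which the scammer disclosed financial information, HAR as the share of LLM-suggested replies that a human operator accepted, and takeoff as the share of engagements in which the scammer answered the seed message. The `Engagement` record and all of its field names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Engagement:
    """Hypothetical per-engagement log record (field names are assumptions)."""
    scammer_replied: bool            # scammer answered the initial seed message
    disclosed_financial_info: bool   # e.g. a mule account or wallet was shared
    suggestions_total: int           # LLM replies proposed to the human operator
    suggestions_accepted: int        # proposals the operator accepted


def rate(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def summarize(engagements: list[Engagement]) -> dict[str, float]:
    """Compute takeoff rate, IDR, and HAR under the assumed definitions."""
    total = len(engagements)
    replied = sum(e.scammer_replied for e in engagements)
    disclosed = sum(e.disclosed_financial_info for e in engagements)
    proposed = sum(e.suggestions_total for e in engagements)
    accepted = sum(e.suggestions_accepted for e in engagements)
    return {
        "takeoff_rate": rate(replied, total),
        "idr": rate(disclosed, total),
        "har": rate(accepted, proposed),
    }


# Toy data for illustration only; these are not figures from the paper.
sample = [
    Engagement(True, True, 6, 5),
    Engagement(True, False, 4, 2),
    Engagement(False, False, 0, 0),
    Engagement(False, False, 0, 0),
]
stats = summarize(sample)
print(stats)
```

If the paper instead measures IDR only over engagements that took off, the denominator in `idr` would be `replied` rather than `total`, which would raise the reported value for the same logs.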

Problem

Research questions and friction points this paper is trying to address.

Scammers use GenAI to produce convincing phishing content at scale, increasing financial fraud

Existing defenses fail to dismantle scam infrastructure like mule accounts

Need proactive strategies using conversational honeypots to extract threat intelligence

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-powered conversational honeypots engage scammers proactively

Extracts sensitive financial information like mule accounts

Achieves 32% information disclosure rate from scammers
