SoK: Large Language Model-Generated Textual Phishing Campaigns End-to-End Analysis of Generation, Characteristics, and Detection

📅 2025-08-29
🤖 AI Summary
Phishing attacks have evolved into highly evasive, scalable “phishing-as-a-service” campaigns fueled by large language models (LLMs), yet their end-to-end lifecycle remains understudied. This paper introduces Generation-Characterization-Defense (GenCharDef), a unified framework that systematically deconstructs LLM-driven textual phishing along three dimensions: generation techniques, attack characteristics, and defense strategies. Presented as a systematization of knowledge (SoK) and integrating adversarial text generation, NLP, and cybersecurity analysis, GenCharDef enables end-to-end characterization of LLM-generated phishing content. It identifies distinct technical pathways and feature patterns, showing how LLM-generated phishing differs from traditional phishing in methodologies, security perspectives, data dependencies, and evaluation practices. The framework provides an analytical foundation for developing AI-native detection mechanisms and more resilient defense architectures.

📝 Abstract
Phishing is a pervasive form of social engineering in which attackers impersonate trusted entities to steal information or induce harmful actions. Text-based phishing dominates for its low cost, scalability, and concealability, advantages recently amplified by large language models (LLMs) that enable “Phishing-as-a-Service” attacks at scale within minutes. Despite the growing research into LLM-facilitated phishing attacks, consolidated systematic research on the phishing attack life cycle remains scarce. In this work, we present the first systematization of knowledge (SoK) on LLM-generated phishing, offering an end-to-end analysis that spans generation techniques, attack features, and mitigation strategies. We introduce Generation-Characterization-Defense (GenCharDef), which systematizes the ways in which LLM-generated phishing differs from traditional phishing across methodologies, security perspectives, data dependencies, and evaluation practices. This framework highlights unique challenges of LLM-driven phishing, providing a coherent foundation for understanding the evolving threat landscape and guiding the design of more resilient defenses.
Problem

Research questions and friction points this paper is trying to address.

Analyzing LLM-generated phishing campaigns end-to-end
Systematizing differences from traditional phishing attacks
Providing a framework for understanding the threat and guiding defense strategies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematizes LLM-generated phishing lifecycle analysis
Introduces GenCharDef framework for threat characterization
Provides an end-to-end analysis spanning generation techniques, attack features, and mitigation strategies
Fengchao Chen
The Faculty of Information Technology, CSIRO Data61 Melbourne Australia
Tingmin Wu
The Faculty of Information Technology, CSIRO Data61 Melbourne Australia
Van Nguyen
The Faculty of Information Technology, CSIRO Data61 Melbourne Australia
Carsten Rudolph
Monash University, Melbourne, Australia
Security, Cryptographic Protocols, Trusted Computing, Network Security