🤖 AI Summary
Large language models (LLMs) deployed in email security systems are vulnerable to coordinated multi-vector attacks—including prompt injection, text refinement, and cross-lingual adversarial perturbations—yet no native LLM-based detection framework exists for phishing emails. Method: We propose the first LLM-native multi-vector phishing detection framework, integrating GPT-4o, Claude Sonnet 4, and Grok-3 with systematic prompt engineering, cross-lingual semantic alignment, and rigorous adversarial testing. Contribution/Results: Our framework achieves >90% phishing detection accuracy and, for the first time, empirically characterizes the robustness boundaries and composable vulnerabilities of LLMs under realistic phishing scenarios. Experiments reveal critical failure modes in existing LLM-based detectors under coordinated prompt injection and multilingual attacks. The framework establishes a reproducible evaluation benchmark and a principled defense paradigm for secure LLM deployment in email security.
📝 Abstract
Email phishing is one of the most prevalent and globally consequential vectors of cyber intrusion. As systems increasingly deploy Large Language Model (LLM) applications, they face evolving phishing email threats that exploit their fundamental architectures. Current LLMs require substantial hardening before deployment in email security systems, particularly against coordinated multi-vector attacks that exploit architectural vulnerabilities. This paper proposes LLMPEA, an LLM-based framework for detecting phishing email attacks across multiple attack vectors, including prompt injection, text refinement, and multilingual attacks. We evaluate three frontier LLMs (GPT-4o, Claude Sonnet 4, and Grok-3) with a comprehensive prompting design to assess their feasibility, robustness, and limitations against phishing email attacks. Our empirical analysis shows that LLMs can detect phishing emails with over 90% accuracy, but also that LLM-based phishing email detection systems could be exploited through adversarial attacks, prompt injection, and multilingual attacks. Our findings provide critical insights for LLM-based phishing detection in real-world settings where attackers exploit multiple vulnerabilities in combination.