A Lightweight Defense Mechanism against Next Generation of Phishing Emails using Distilled Attention-Augmented BiLSTM

📅 2026-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge posed by highly realistic phishing emails generated by large language models (LLMs), which often evade traditional text-based detection systems. To this end, the authors propose a lightweight phishing detection method based on knowledge distillation, wherein a MobileBERT model is distilled into an attention-augmented BiLSTM architecture with only 4.5 million parameters. The resulting model enables real-time, privacy-preserving detection at both endpoint and gateway levels without requiring hardware acceleration. By incorporating a multi-head attention mechanism and training on a hybrid dataset that includes LLM-generated phishing samples, the model achieves competitive performance: under five evaluation protocols, it incurs only a 1–2.5 point drop in weighted F1 score compared to state-of-the-art Transformer baselines, while offering 80–95% faster inference and a 95–99% reduction in model size.

📝 Abstract
The current generation of large language models produces sophisticated social-engineering content that bypasses standard text-screening systems in business communication platforms. We propose a privacy-preserving deception-detection solution for mail gateways and endpoints that meets the performance requirements of network and mobile security systems. A fine-tuned MobileBERT teacher is distilled into a BiLSTM student with multi-head attention that preserves semantic discrimination with only 4.5 million parameters. The hybrid training dataset combines human-written messages with LLM-generated paraphrases that apply masking and personalization techniques, improving robustness to modern attacks. Evaluation spans five protocols: human-only and LLM-only tests, two cross-distribution transfer tests, and a production-like mixed-traffic test, covering native, cross-distribution, and combined-traffic scenarios. On the mixture split, the distilled model stays within 1-2.5 weighted-F1 points of strong Transformer baselines, including ModernBERT, DeBERTaV3-base, T5-base, DeepSeek-R1 Distill Qwen-1.5B, and Phi-4 mini, while achieving 80-95% faster inference and 95-99% smaller model size. This combination of accuracy, low latency, and compact size enables real-time filtering without acceleration hardware and supports policy-based management. The paper also examines performance under high traffic, privacy-protection measures, and practical deployment considerations.
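The page does not reproduce the paper's training objective, but a MobileBERT-to-BiLSTM setup like the one described typically uses standard soft-label knowledge distillation: the student is trained against a blend of the hard labels and the teacher's temperature-softened output distribution. The sketch below is illustrative only; the temperature `T` and mixing weight `alpha` are assumptions, not values from the paper.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T softens the distribution."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Hinton-style KD: alpha * CE(hard labels) + (1-alpha) * T^2 * KL(teacher || student)."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # KL divergence between softened teacher and student distributions
    kl = np.sum(p_teacher * (np.log(p_teacher + 1e-12)
                             - np.log(p_student + 1e-12)), axis=-1).mean()
    # Ordinary cross-entropy against the ground-truth (phishing / benign) labels
    hard = softmax(student_logits)
    ce = -np.log(hard[np.arange(len(labels)), labels] + 1e-12).mean()
    # T^2 rescaling keeps the soft-target gradient magnitude comparable
    return alpha * ce + (1 - alpha) * T**2 * kl
```

When the student's logits match the teacher's, the KL term vanishes and only the hard-label cross-entropy remains, which is why the blended loss pulls the student toward the teacher's decision boundary rather than just the labels.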
Problem

Research questions and friction points this paper is trying to address.

phishing emails
large language models
social engineering
email security
text screening
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Distillation
Attention-Augmented BiLSTM
Phishing Email Detection
Lightweight Model
Cross-Distribution Robustness
Morteza Eskandarian
Canadian Institute for Cybersecurity, Faculty of Computer Science, University of New Brunswick, Fredericton, NB, Canada
Mahdi Rabbani
Research Scientist, Canadian Institute for Cybersecurity, UNB | Dalhousie University
AI for Cybersecurity, Knowledge Distillation, Graph Neural Networks, Malware Analysis
Arun Kaniyamattam
Canadian Institute for Cybersecurity, Faculty of Computer Science, University of New Brunswick, Fredericton, NB, Canada
Fatemeh Nejati
Canadian Institute for Cybersecurity, Faculty of Computer Science, University of New Brunswick, Fredericton, NB, Canada
Mansur Mirani
Mastercard Vancouver Tech Hub, Vancouver, British Columbia, Canada
Gunjan Piya
Mastercard Vancouver Tech Hub, Vancouver, British Columbia, Canada
Igor Opushnyev
Mastercard Vancouver Tech Hub, Vancouver, British Columbia, Canada
Ali A. Ghorbani
Professor and Canada Research Chair in Cybersecurity
Cybersecurity, Machine Learning
Sajjad Dadkhah
Canada Mastercard IoT Research Chair | Assistant Professor | Interim Associate Director at CIC, UNB
Cybersecurity, Digital multimedia security, NLP, IoT security, ML security