Packet Inspection Transformer: A Self-Supervised Journey to Unseen Malware Detection with Few Samples

📅 2024-09-26
🤖 AI Summary
To reduce the reliance on large-scale labeled data and the poor generalization of existing zero-day malware detectors, this paper proposes a deep packet inspection approach that combines self-supervised pretraining with few-shot adaptation. It is presented as the first work to model raw byte-level network packets with a Transformer: the model is pretrained with a masked language modeling (MLM) objective for unsupervised representation learning, and the learned representations are then used by prototypical networks for few-shot threat classification. Key contributions include: (1) end-to-end semantic learning directly from byte sequences; (2) generalization to unseen malware families without extensive labeling; and (3) reported accuracies of up to 94.76% on UNSW-NB15 and 83.25% on CIC-IoT23, described as outperforming supervised baselines while remaining robust under extreme data scarcity (1–5 samples per class).

📝 Abstract
As networks continue to expand and become more interconnected, the need for novel malware detection methods becomes more pronounced. Traditional security measures are increasingly inadequate against the sophistication of modern cyber attacks. Deep Packet Inspection (DPI) has been pivotal in enhancing network security, offering an in-depth analysis of network traffic that surpasses conventional monitoring techniques. DPI not only examines the metadata of network packets, but also dives into the actual content being carried within the packet payloads, providing a comprehensive view of the data flowing through networks. While the integration of advanced deep learning techniques with DPI has introduced modern methodologies into malware detection and network traffic classification, state-of-the-art supervised learning approaches are limited by their reliance on large amounts of annotated data and their inability to generalize to novel, unseen malware threats. To address these limitations, this paper leverages recent advancements in self-supervised learning (SSL) and few-shot learning (FSL). Our proposed self-supervised approach trains a transformer via SSL to learn the embedding of packet content, including payload, from vast amounts of unlabeled data by masking portions of packets, leading to a learned representation that generalizes to various downstream tasks. Once the representation is extracted from the packets, it is used to train a malware detection algorithm. The representation obtained from the transformer is then used to adapt the malware detector to novel types of attacks using few-shot learning approaches. Our experimental results demonstrate that our method achieves classification accuracies of up to 94.76% on the UNSW-NB15 dataset and 83.25% on the CIC-IoT23 dataset.
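
The pretraining step described in the abstract, masking portions of packets so a transformer learns to predict them, might be prepared along the following lines. This is only a minimal sketch of input construction, not the authors' pipeline: the per-byte tokenization, the `MASK_ID` and `IGNORE` values, the 15% masking rate, and the function name `mask_packet` are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Tuple

MASK_ID = 256   # assumed id for a [MASK] token; real byte values occupy 0..255
IGNORE = -100   # assumed label meaning "this position is not a prediction target"


def mask_packet(
    packet: bytes,
    mask_prob: float = 0.15,          # assumed rate, not taken from the paper
    seed: Optional[int] = None,
) -> Tuple[List[int], List[int]]:
    """Build (inputs, labels) for masked-byte prediction over one packet.

    Each byte is treated as one token. A masked position carries MASK_ID in
    the inputs and the original byte in the labels; every other position
    keeps its byte in the inputs and IGNORE in the labels.
    """
    rng = random.Random(seed)
    inputs: List[int] = []
    labels: List[int] = []
    for byte in packet:
        if rng.random() < mask_prob:
            inputs.append(MASK_ID)
            labels.append(byte)
        else:
            inputs.append(byte)
            labels.append(IGNORE)
    return inputs, labels


if __name__ == "__main__":
    # The first bytes of a made-up IPv4 header, used purely as sample input.
    sample = bytes.fromhex("4500002800010000400600007f000001")
    masked_inputs, targets = mask_packet(sample, seed=0)
    print(masked_inputs)
    print(targets)
```

A model trained on such pairs would be asked to recover the original byte at each masked position, which requires no malware labels at all.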
Problem

Research questions and friction points this paper is trying to address.

Develop self-supervised malware detection
Enhance detection with few-shot learning
Address unseen malware generalization challenges
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised learning for malware detection
Transformer models for packet analysis
Few-shot learning for attack adaptation
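
The few-shot adaptation idea listed above can be illustrated with nearest-prototype classification, the core of the prototype-based approach the summary mentions. The sketch below is a simplified stand-in, not the paper's implementation: the two-dimensional embeddings are toy values rather than transformer outputs, and the helper names `class_prototypes` and `classify` are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Embedding = Sequence[float]


def class_prototypes(
    support: List[Tuple[Embedding, str]]
) -> Dict[str, List[float]]:
    """Average the support embeddings of each class into one prototype."""
    grouped: Dict[str, List[Embedding]] = defaultdict(list)
    for embedding, label in support:
        grouped[label].append(embedding)
    return {
        label: [sum(dim) / len(embeddings) for dim in zip(*embeddings)]
        for label, embeddings in grouped.items()
    }


def classify(query: Embedding, prototypes: Dict[str, List[float]]) -> str:
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


if __name__ == "__main__":
    # Toy support set: two labeled samples per class (a "2-shot" setting).
    support_set = [
        ([0.0, 0.1], "benign"),
        ([0.2, 0.0], "benign"),
        ([1.0, 0.9], "exploit"),
        ([0.8, 1.1], "exploit"),
    ]
    protos = class_prototypes(support_set)
    print(classify([0.9, 0.8], protos))
```

Because a new attack class only needs a handful of embeddings to form a prototype, no retraining of the encoder is required to add it, which is what makes this style of classifier a natural fit for the 1–5 samples-per-class setting.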
Authors
Kyle Stein
Ph.D. Candidate, University of West Florida
Deep Learning, Computer Vision, Cybersecurity
A. Mahyari
Department of Intelligent Systems and Robotics, University of West Florida, Pensacola, FL, USA; Florida Institute For Human and Machine Cognition (IHMC), Pensacola, FL, USA
Guillermo A. Francia
Center for Cybersecurity, University of West Florida, Pensacola, FL, USA
Eman El-Sheikh
Center for Cybersecurity, University of West Florida, Pensacola, FL, USA