🤖 AI Summary
To address poor generalizability and weak robustness in intrusion detection caused by network traffic heterogeneity and diverse attack patterns, this paper proposes BERTector, a novel framework grounded in pre-trained language models. First, it introduces the NSS-Tokenizer, a semantic-aware tokenization module tailored to network traffic. Second, it establishes a joint multi-source dataset training paradigm that supports both supervised fine-tuning and collaborative optimization across heterogeneous data. Third, it pioneers the integration of Low-Rank Adaptation (LoRA) into intrusion detection, markedly improving training efficiency and parameter-update stability. Evaluated on multiple benchmark datasets, BERTector achieves state-of-the-art detection accuracy while demonstrating superior cross-domain generalization and robustness against adversarial perturbations. This work establishes a scalable paradigm for network security analysis based on pre-trained language models.
📝 Abstract
Intrusion detection systems (IDS) face challenges in generalization and robustness due to the heterogeneity of network traffic and the diversity of attack patterns. To address this, we propose a joint-dataset training paradigm for IDS and introduce BERTector, a scalable framework built on BERT. BERTector integrates three key components: NSS-Tokenizer for traffic-aware semantic tokenization, supervised fine-tuning on a hybrid dataset, and low-rank adaptation (LoRA) for efficient training. Extensive experiments show that BERTector achieves state-of-the-art detection accuracy, strong cross-dataset generalization, and excellent robustness to adversarial perturbations. This work establishes a unified and efficient solution for modern IDS in complex and dynamic network environments.
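The abstract highlights LoRA as the mechanism that keeps fine-tuning efficient. The paper's own configuration is not given here, so the following is only a minimal NumPy sketch of the general LoRA idea: the pretrained weight matrix is frozen, and only a low-rank residual update `B @ A` is trained. The dimensions (a 768-wide BERT-style projection, rank 8, alpha 16) are illustrative assumptions, not values from the paper.

```python
import numpy as np

class LoRALinear:
    """Linear layer with a frozen pretrained weight and a trainable
    low-rank update, as in Low-Rank Adaptation (LoRA)."""

    def __init__(self, d_out, d_in, r=8, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
        self.A = rng.standard_normal((r, d_in)) * 0.01  # trainable, small init
        self.B = np.zeros((d_out, r))                   # trainable, zero init
        self.scale = alpha / r                          # scaling of the update

    def forward(self, x):
        # y = x W^T + scale * x A^T B^T  (frozen path + low-rank residual)
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

    def trainable_params(self):
        # Only A and B are updated during fine-tuning; W stays frozen.
        return self.A.size + self.B.size

layer = LoRALinear(d_out=768, d_in=768, r=8)
full = layer.W.size                 # 589,824 params if fully fine-tuned
low = layer.trainable_params()      # 12,288 params with rank-8 LoRA
print(f"trainable fraction: {low / full:.3f}")
```

Because `B` starts at zero, the adapted layer is exactly the pretrained layer at initialization; training then moves only ~2% of the parameters per adapted matrix, which is the efficiency gain the abstract refers to.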