🤖 AI Summary
To address the performance degradation that concept drift and severe class imbalance cause in network intrusion detection, and in particular the high annotation cost for rare attacks, this paper proposes a continual adaptation framework with low labeling cost. The method combines density-aware active sampling, which prioritizes highly informative unlabeled samples for annotation, with data augmentation driven by a conditional generative adversarial network (cGAN), which synthesizes samples to model the evolving data distribution. Coupled with online incremental learning and density estimation, the framework improves detection of rare attacks, including Infiltration, Web Attack, and FTP-BruteForce. On the CIC-IDS 2018 dataset, the approach raises the overall F1-score from 0.60 (without adaptation) to 0.86 and reaches F1-scores of 0.30, 0.50, and 0.71 for the three rare attack classes, respectively, up from 0.001, 0.04, and 0.00 without adaptation.
📝 Abstract
Machine learning has shown promise in network intrusion detection systems, yet its performance often degrades due to concept drift and imbalanced data. These challenges are compounded by the labor-intensive process of labeling network traffic, especially for evolving and rare attack types, which makes it difficult to select the right data for adaptation. To address these issues, we propose a generative active adaptation framework that minimizes labeling effort while enhancing model robustness. Our approach employs density-aware active sampling to identify the most informative samples for annotation and leverages deep generative models to synthesize diverse samples, thereby augmenting the training set and mitigating the effects of concept drift. We evaluate our end-to-end framework on both simulated IDS data and a real-world ISP dataset, demonstrating significant improvements in intrusion detection performance. Our method boosts the overall F1-score from 0.60 (without adaptation) to 0.86. On the CIC-IDS 2018 dataset, rare attacks such as Infiltration, Web Attack, and FTP-BruteForce, which achieve F1-scores of only 0.001, 0.04, and 0.00 without adaptation, improve to 0.30, 0.50, and 0.71, respectively, with generative active adaptation. Our framework enhances rare attack detection while reducing labeling costs, making it a scalable and adaptive solution for real-world intrusion detection.
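The abstract does not spell out how density-aware active sampling scores a candidate, so the following is only a minimal sketch of one common formulation: predictive entropy weighted by a kernel density estimate. The function names, the Gaussian kernel, and the `beta` and `bandwidth` parameters are illustrative assumptions, not the authors' implementation. A positive `beta` favors uncertain samples in dense regions, while a negative `beta` favors uncertain samples in sparse regions, where rare attacks may lie.

```python
import numpy as np


def predictive_entropy(probs):
    """Shannon entropy of each row of class probabilities (higher = more uncertain)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def kernel_density(features, bandwidth=1.0):
    """Unnormalized Gaussian kernel density of each sample within the unlabeled pool."""
    diff = features[:, None, :] - features[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2)).mean(axis=1)


def density_aware_select(probs, features, budget, beta=1.0, bandwidth=1.0):
    """Return indices of the `budget` samples with the highest score.

    Assumed score (hypothetical, for illustration): entropy * density ** beta.
    """
    scores = predictive_entropy(probs) * kernel_density(features, bandwidth) ** beta
    return np.argsort(-scores)[:budget]


# Usage: four unlabeled flows with equally uncertain predictions;
# the last one is far from the others in feature space.
pool_features = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
pool_probs = np.full((4, 2), 0.5)
to_label_dense = density_aware_select(pool_probs, pool_features, budget=1, beta=1.0)
to_label_sparse = density_aware_select(pool_probs, pool_features, budget=1, beta=-1.0)
```

The selected indices would then be sent for annotation, and the newly labeled samples could be combined with generated samples to update the detector. How the framework actually balances these steps is not stated in the abstract.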