Are We There Yet? Unraveling the State-of-the-Art Graph Network Intrusion Detection Systems

📅 2025-03-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses three challenges in graph-based network intrusion detection systems (GIDS): poor reproducibility, weak robustness, and inconsistent evaluation. It benchmarks state-of-the-art GIDS on four network traffic datasets (three public datasets and a newly collected large-scale enterprise dataset) and finds that published results are difficult to reproduce, particularly for false positive rates and for robustness under adversarial attacks. To address these issues, the paper proposes a standardized evaluation framework that quantifies the impact of data scale, graph structural representation, and implementation details on model performance, and it identifies three key bottlenecks: data bias, inconsistent feature engineering, and missing training configurations. The contributions include: (1) a multi-dimensional reproducibility benchmark designed specifically for GIDS; (2) a formal adversarial robustness evaluation protocol; and (3) a practical guideline for developing reproducible GIDS.

📝 Abstract

Network Intrusion Detection Systems (NIDS) are vital for ensuring enterprise security. Recently, Graph-based NIDS (GIDS) have attracted considerable attention because of their capability to effectively capture the complex relationships within the graph structures of data communications. Despite their promise, the reproducibility and replicability of these GIDS remain largely unexplored, posing challenges for developing reliable and robust detection systems. This study bridges this gap by designing a systematic approach to evaluate state-of-the-art GIDS, which includes critically assessing, extending, and clarifying the findings of these systems. We further assess the robustness of GIDS under adversarial attacks. Evaluations were conducted on three public datasets as well as a newly collected large-scale enterprise dataset. Our findings reveal significant performance discrepancies, highlighting challenges related to dataset scale, model inputs, and implementation settings. We demonstrate difficulties in reproducing and replicating results, particularly concerning false positive rates and robustness against adversarial attacks. This work provides valuable insights and recommendations for future research, emphasizing the importance of rigorous reproduction and replication studies in developing robust and generalizable GIDS solutions.
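Because the abstract singles out false positive rates as a point where reproduced results diverge, a minimal sketch of that comparison may help. It assumes binary labels (1 = attack, 0 = benign); the function names, the toy labels, and the `reported` value are hypothetical illustrations, not taken from the paper.

```python
def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN), assuming 1 = attack and 0 = benign."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    negatives = fp + tn
    # With no benign samples the rate is undefined; return 0.0 by convention.
    return fp / negatives if negatives else 0.0


def reproduction_gap(reported, reproduced):
    """Absolute difference between a published metric and a re-run."""
    return abs(reported - reproduced)


# Hypothetical toy labels, not data from the paper.
y_true = [0, 0, 0, 0, 1, 1, 0, 0]
y_pred = [0, 1, 0, 0, 1, 0, 1, 0]

fpr = false_positive_rate(y_true, y_pred)
# The reported value is a placeholder for a number quoted in a publication.
gap = reproduction_gap(reported=0.01, reproduced=fpr)
print(f"reproduced FPR = {fpr:.3f}, gap to reported = {gap:.3f}")
```

In an enterprise setting benign traffic vastly outnumbers attacks, so even a small gap in FPR can translate into a large number of extra alerts, which is one reason to report it alongside accuracy-style metrics.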

Problem

Research questions and friction points this paper is trying to address.

Evaluating reproducibility and replicability of Graph-based NIDS.

Assessing robustness of GIDS under adversarial attacks.

Identifying performance discrepancies related to dataset scale, model inputs, and implementation settings.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic reproduction and replication study of state-of-the-art Graph-based NIDS.

Robustness assessment of GIDS under adversarial attacks.

Evaluation on three public datasets and a newly collected large-scale enterprise dataset.
