🤖 AI Summary
To address the scarcity of high-quality labeled data and the limited, outdated attack types in public network intrusion detection datasets, this paper proposes AutoGraphAD—a novel unsupervised network flow anomaly detection method based on a Heterogeneous Variational Graph Autoencoder (HVGA). AutoGraphAD constructs a heterogeneous graph comprising IP addresses and network connections, and computes a holistic anomaly score by jointly weighting reconstruction loss, structural preservation loss, and KL divergence. Crucially, it incorporates contrastive learning to enhance the robustness of graph representations and eliminates reliance on downstream classifiers, enabling end-to-end unsupervised training and inference. AutoGraphAD matches or exceeds the detection performance of prior unsupervised approaches while accelerating training and inference by approximately 1.18 and 1.03 orders of magnitude (roughly 15× and 11×), respectively—demonstrating strong suitability for real-time deployment.
📝 Abstract
Network Intrusion Detection Systems (NIDS) are essential tools for detecting network attacks and intrusions. While extensive research has explored the use of supervised Machine Learning for attack detection and characterisation, these methods require accurately labelled datasets, which are very costly to obtain. Moreover, existing public datasets have limited and/or outdated attacks, and many of them suffer from mislabelled data. To reduce the reliance on labelled data, we propose AutoGraphAD, a novel unsupervised anomaly detection approach based on a Heterogeneous Variational Graph Autoencoder. AutoGraphAD operates on heterogeneous graphs, built from connection and IP nodes that capture network activity within a time window. The model is trained using unsupervised and contrastive learning, without relying on any labelled data. The reconstruction loss, structural loss, and KL divergence are then weighted and combined into an anomaly score used for anomaly detection. Overall, AutoGraphAD yields the same, and in some cases better, results than previous unsupervised approaches, such as Anomal-E, but without requiring costly downstream anomaly detectors. As a result, AutoGraphAD achieves around 1.18 orders of magnitude faster training and 1.03 orders of magnitude faster inference, which represents a significant advantage for operational deployment.
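The abstract describes the anomaly score as a weighted combination of three loss terms. As a minimal sketch of that idea (the function name, weight values, and per-window inputs below are hypothetical, not the paper's actual implementation):

```python
def anomaly_score(recon_loss: float,
                  struct_loss: float,
                  kl_div: float,
                  w_recon: float = 1.0,
                  w_struct: float = 1.0,
                  w_kl: float = 1.0) -> float:
    """Combine the three per-window loss terms into a single anomaly score.

    recon_loss  -- reconstruction loss of the variational graph autoencoder
    struct_loss -- structural (graph-preservation) loss
    kl_div      -- KL divergence of the latent distribution
    The weights are free hyperparameters; the paper's values are not given here.
    A window whose score exceeds a chosen threshold would be flagged as anomalous.
    """
    return w_recon * recon_loss + w_struct * struct_loss + w_kl * kl_div


# Example: score a time window and compare against a hypothetical threshold.
score = anomaly_score(recon_loss=0.8, struct_loss=0.5, kl_div=0.2)
is_anomalous = score > 1.0  # threshold chosen for illustration only
```

Because the score is computed directly from the autoencoder's losses, no separately trained downstream detector is needed at inference time, which is where the reported speedups over approaches like Anomal-E come from.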