🤖 AI Summary
Existing NetFlow datasets commonly lack critical temporal features—such as inter-packet arrival intervals and flow duration—limiting both the detection accuracy and timeliness of machine learning–based network intrusion detection systems (NIDS). To address this, we construct and publicly release the first NetFlow dataset embedded with fine-grained temporal features, and propose a Time–Frequency Signal Representation (TFSP) method that integrates short-time Fourier transform (STFT) with systematic temporal analysis to model the full-dimensional temporal distribution of NetFlow for the first time. Experiments reveal distinct, class-separable patterns of multiple attack types in the time–frequency domain. The extracted time-aware features improve ML-based detection accuracy against stealthy attacks by 12.7% and enable early attack identification up to 3.8 seconds sooner. This work bridges dual gaps in the NIDS field: the absence of temporally rich NetFlow data and a principled temporal modeling paradigm.
📝 Abstract
This paper investigates the temporal analysis of NetFlow datasets for machine learning (ML)-based network intrusion detection systems (NIDS). Although many previous studies have highlighted the critical role of temporal features, such as inter-packet arrival time and flow length/duration, in NIDS, the currently available NetFlow datasets for NIDS lack these temporal features. This study addresses this gap by creating and making publicly available a set of NetFlow datasets that incorporate these temporal features [1]. With these temporal features, we provide a comprehensive temporal analysis of NetFlow datasets by examining the distribution of various features over time and presenting time-series representations of NetFlow features. This temporal analysis has not been previously provided in the existing literature. We also borrowed an idea from signal processing, time frequency analysis, and tested it to see how different the time frequency signal presentations (TFSPs) are for various attacks. The results indicate that many attacks have unique patterns, which could help ML models to identify them more easily.