🤖 AI Summary
Existing byte-level models for encrypted traffic classification use a rigid information granularity and fail to capture the diverse correlations between bytes. To address this, we propose MH-Net, a multi-view heterogeneous traffic graph framework. Varying numbers of traffic bits are aggregated into multiple types of traffic units, which yields multi-view traffic graphs at different information granularities. Different types of byte correlations (e.g., header-payload relationships) are then modeled explicitly, which makes the graphs heterogeneous. Contrastive learning, applied in a multi-task manner, strengthens the robustness of the learned traffic unit representations. On the ISCX and CIC-IoT datasets, MH-Net achieves the best overall performance among dozens of SOTA methods on both packet-level and flow-level traffic classification.
📝 Abstract
With the growing significance of network security, the classification of encrypted traffic has emerged as an urgent challenge. Traditional byte-based traffic analysis methods are constrained by the rigid granularity of information and fail to fully exploit the diverse correlations between bytes. To address these limitations, this paper introduces MH-Net, a novel approach for classifying network traffic that leverages multi-view heterogeneous traffic graphs to model the intricate relationships between traffic bytes. The essence of MH-Net lies in aggregating varying numbers of traffic bits into multiple types of traffic units, thereby constructing multi-view traffic graphs with diverse information granularities. By accounting for different types of byte correlations, such as header-payload relationships, MH-Net further endows the traffic graph with heterogeneity, significantly enhancing model performance. Notably, we employ contrastive learning in a multi-task manner to strengthen the robustness of the learned traffic unit representations. Experiments conducted on the ISCX and CIC-IoT datasets for both the packet-level and flow-level traffic classification tasks demonstrate that MH-Net achieves the best overall performance compared to dozens of SOTA methods.
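The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`bytes_to_units`, `typed_edges`, `build_views`), the granularities of 4, 8, and 16 bits, and the sliding-window rule for connecting units are all assumptions chosen for illustration. The abstract only states that varying numbers of bits are aggregated into traffic units and that different byte correlations, such as header-payload relationships, give the graph its heterogeneity.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int, str]  # (source node index, target node index, relation type)


def bytes_to_units(data: bytes, bits_per_unit: int) -> List[int]:
    """Group the bit stream of `data` into integers of `bits_per_unit` bits each."""
    total_bits = len(data) * 8
    value = int.from_bytes(data, "big")
    mask = (1 << bits_per_unit) - 1
    # Trailing bits that do not fill a whole unit are dropped (an assumption).
    usable_bits = total_bits - total_bits % bits_per_unit
    units = []
    for start in range(0, usable_bits, bits_per_unit):
        shift = total_bits - start - bits_per_unit
        units.append((value >> shift) & mask)
    return units


def typed_edges(num_header: int, num_payload: int, window: int = 2) -> List[Edge]:
    """Connect units that lie within `window` positions of each other.

    The sliding-window rule is a hypothetical stand-in for whatever
    correlation measure the paper uses; only the relation typing
    (header-header, payload-payload, header-payload) follows the abstract.
    """
    total = num_header + num_payload
    edges: List[Edge] = []
    for i in range(total):
        for j in range(i + 1, min(i + window + 1, total)):
            i_in_header = i < num_header
            j_in_header = j < num_header
            if i_in_header and j_in_header:
                relation = "header-header"
            elif not i_in_header and not j_in_header:
                relation = "payload-payload"
            else:
                relation = "header-payload"
            edges.append((i, j, relation))
    return edges


def build_views(
    header: bytes,
    payload: bytes,
    granularities: Tuple[int, ...] = (4, 8, 16),
    window: int = 2,
) -> Dict[int, dict]:
    """Build one heterogeneous graph per granularity (one 'view' each)."""
    views: Dict[int, dict] = {}
    for bits in granularities:
        header_units = bytes_to_units(header, bits)
        payload_units = bytes_to_units(payload, bits)
        views[bits] = {
            "nodes": header_units + payload_units,
            "edges": typed_edges(len(header_units), len(payload_units), window),
        }
    return views
```

Calling `build_views(header_bytes, payload_bytes)` returns a dictionary keyed by unit size in bits. Each entry holds the unit values as nodes and a list of typed edges, which a heterogeneous graph neural network library could then consume. The same packet thus appears at several granularities, with header-payload links kept distinct from links inside the header or inside the payload.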