🤖 AI Summary
To address two key bottlenecks in encrypted traffic classification—slow inference of pretrained models and poor generalizability due to heavy reliance on labeled data—this paper proposes NetClus. Methodologically, it introduces a clustering-friendly loss function that jointly optimizes classification accuracy and clustering separability in the latent space; defines the Adaptive Separability Index (ASI) to quantify cluster purity and enable online identification of emerging traffic types; and employs knowledge distillation coupled with a lightweight feedforward network for model compression. Its primary contribution is the first framework that co-optimizes latent-space clustering discriminability and classification performance, thereby achieving both high efficiency and open-set recognition capability. Experiments demonstrate that NetClus keeps the classification-accuracy drop below 1% while accelerating inference by up to 6.2×, significantly enhancing real-time analysis and adaptability to unseen traffic categories.
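The summary's clustering-friendly loss can be illustrated with a minimal sketch. The paper's exact formulation is not given here, so the code below assumes a common construction: cross-entropy on the classification head plus a latent-space term that rewards intra-class compactness and inter-class centroid separation. Function and parameter names (`cluster_friendly_loss`, `alpha`) are illustrative, not NetClus's actual API.

```python
import numpy as np

def cluster_friendly_loss(embeddings, logits, labels, alpha=0.1):
    """Hypothetical sketch of a loss that jointly optimizes classification
    and clustering separability; NetClus's real objective may differ."""
    # Softmax cross-entropy on the classification head.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels].mean()

    # Per-class centroids in the latent space.
    classes = np.unique(labels)
    centroids = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])

    # Intra-class compactness: mean distance of each point to its own centroid.
    intra = np.mean([
        np.linalg.norm(embeddings[i] - centroids[np.searchsorted(classes, labels[i])])
        for i in range(len(labels))
    ])

    # Inter-class separation: mean pairwise centroid distance (to be maximized).
    pair_dists = [np.linalg.norm(centroids[i] - centroids[j])
                  for i in range(len(classes)) for j in range(i + 1, len(classes))]
    inter = np.mean(pair_dists) if pair_dists else 0.0

    # Lower loss when classes are both well classified and well separated.
    return ce + alpha * (intra - inter)
```

With identical logits, a batch whose embeddings form tight, well-separated clusters yields a lower loss than one with overlapping clusters, which is the separability pressure the summary describes.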
📝 Abstract
Traffic classification plays a significant role in network service management. The advancement of deep learning has established pretrained models as a robust approach for this task. However, contemporary encrypted traffic classification systems face two limitations. First, pretrained models typically have large-scale architectures whose extensive parameterization results in slow inference and high computational latency. Second, reliance on labeled data for fine-tuning restricts these models to predefined supervised classes, creating a bottleneck when novel traffic types emerge in the evolving Internet landscape. To address these challenges, we propose NetClus, a novel framework that integrates pretrained models with distillation-enhanced clustering acceleration. During fine-tuning, NetClus first introduces a cluster-friendly loss that jointly reshapes the latent space for both classification and clustering. It then distills the fine-tuned model into a lightweight feed-forward neural network that retains its semantics. During inference, NetClus performs a heuristic merge with near-linear runtime and validates cluster purity with the newly proposed Adaptive Separability Index (ASI) metric, identifying emergent traffic types while expediting classification. Benchmarked against existing pretrained methods, NetClus achieves up to 6.2× acceleration while keeping classification degradation below 1%.
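The distillation step that compresses the fine-tuned model into a lightweight feed-forward network can be sketched with the standard soft-target formulation. The abstract does not specify NetClus's distillation objective, so this assumes the common Hinton-style blend of a temperature-scaled soft-target term and hard-label cross-entropy; the names `distillation_loss`, `T`, and `beta` are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, beta=0.5):
    """Generic soft-target distillation, assumed as a stand-in for
    NetClus's distillation step (its exact objective is not given here)."""
    # Soft-target term: cross-entropy against the teacher's temperature-
    # softened distribution, scaled by T^2 as is conventional.
    p_teacher = softmax(teacher_logits / T)
    log_p_student = np.log(softmax(student_logits / T) + 1e-12)
    soft = -(p_teacher * log_p_student).sum(axis=1).mean() * (T * T)

    # Hard-label term: ordinary cross-entropy on the ground-truth labels.
    log_p = np.log(softmax(student_logits) + 1e-12)
    hard = -log_p[np.arange(len(labels)), labels].mean()

    return beta * soft + (1 - beta) * hard
```

A student whose logits track the teacher's incurs a lower loss than one that contradicts it, which is what lets the small feed-forward network inherit the fine-tuned model's semantics.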