Convolutions are Competitive with Transformers for Encrypted Traffic Classification with Pre-training

📅 2025-08-03
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the inefficiency (O(n²) self-attention complexity) and poor generalization to unseen sequence lengths (due to rigid explicit positional encoding) of Transformer models in encrypted traffic classification, this paper proposes NetConv, a lightweight pre-trained convolutional architecture. Methodologically, NetConv introduces traffic-aware convolutional layers to capture local byte-level patterns; integrates sequence-wise byte gating and window-wise byte scoring mechanisms to enhance protocol-specific feature representation; and employs a continuous byte masking (CBM) objective for unsupervised pre-training. Evaluated on four standard encrypted traffic classification benchmarks, NetConv achieves an average accuracy improvement of 6.88% over prior methods, while attaining 7.41× higher throughput than state-of-the-art Transformer baselines. The model thus delivers a favorable trade-off among computational efficiency, scalability to variable-length sequences, and classification performance.
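To make the two mechanisms named above concrete, here is a minimal pure-Python sketch of window-wise byte scoring (weighting each local window of bytes) and sequence-wise byte gating (scaling every position by a gate computed from the whole sequence). The scoring and gating formulas, function names, and the window size are illustrative assumptions, not the paper's actual design.

```python
import math

def softmax(xs):
    """Normalize a list of scores into a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def window_scores(byte_seq, window=4):
    """Window-wise byte scoring: score each sliding window of bytes.

    The toy score here is just the window's mean byte value; the real
    model would learn these scores from data.
    """
    scores = []
    for i in range(len(byte_seq) - window + 1):
        w = byte_seq[i:i + window]
        scores.append(sum(w) / window)
    return softmax(scores)

def gated_sequence(byte_seq):
    """Sequence-wise byte gating: scale each byte by one global gate.

    The gate is a sigmoid of a sequence-level statistic (here, the mean
    byte value normalized to [0, 1]).
    """
    mean = sum(byte_seq) / len(byte_seq)
    gate = 1.0 / (1.0 + math.exp(-(mean / 255.0)))
    return [b * gate for b in byte_seq]
```

In this toy form, the window scores highlight which local byte windows dominate, while the gate modulates the whole sequence; in the actual model both would be learned, per-channel operations inside the stacked traffic convolution layers.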

πŸ“ Abstract
Encrypted traffic classification is vital for modern network management and security. To reduce reliance on handcrafted features and labeled data, recent methods focus on learning generic representations through pre-training on large-scale unlabeled data. However, current pre-trained models face two limitations originating from the adopted Transformer architecture: (1) Limited model efficiency due to the self-attention mechanism with quadratic complexity; (2) Unstable traffic scalability to longer byte sequences, as the explicit positional encodings fail to generalize to input lengths not seen during pre-training. In this paper, we investigate whether convolutions, with linear complexity and implicit positional encoding, are competitive with Transformers in encrypted traffic classification with pre-training. We first conduct a systematic comparison, and observe that convolutions achieve higher efficiency and scalability, with lower classification performance. To address this trade-off, we propose NetConv, a novel pre-trained convolution model for encrypted traffic classification. NetConv employs stacked traffic convolution layers, which enhance the ability to capture localized byte-sequence patterns through window-wise byte scoring and sequence-wise byte gating. We design a continuous byte masking pre-training task to help NetConv learn protocol-specific patterns. Experimental results on four tasks demonstrate that NetConv improves average classification performance by 6.88% and model throughput by 7.41X over existing pre-trained models.
Problem

Research questions and friction points this paper is trying to address.

Improving efficiency in encrypted traffic classification models
Enhancing scalability for longer byte sequences
Reducing reliance on handcrafted features and labeled data
Innovation

Methods, ideas, or system contributions that make the work stand out.

NetConv uses stacked traffic convolution layers
Implements continuous byte masking pre-training
Enhances localized byte-sequence pattern capture
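The continuous byte masking pre-training idea listed above can be sketched as follows: contiguous spans of bytes are replaced by a mask token and become reconstruction targets. The span length, masking ratio, mask-token value, and function name are illustrative assumptions, not the paper's exact configuration.

```python
import random

MASK = 256  # out-of-range token standing in for a masked byte (assumed)

def continuous_byte_mask(byte_seq, span_len=4, mask_ratio=0.15, seed=0):
    """Mask contiguous byte spans covering roughly mask_ratio of the input.

    Returns the masked sequence and a dict mapping masked positions to
    their original byte values, which serve as reconstruction targets.
    """
    rng = random.Random(seed)
    masked = list(byte_seq)
    targets = {}
    n_spans = max(1, int(len(byte_seq) * mask_ratio / span_len))
    for _ in range(n_spans):
        start = rng.randrange(0, len(byte_seq) - span_len + 1)
        for i in range(start, start + span_len):
            if masked[i] != MASK:
                targets[i] = masked[i]
                masked[i] = MASK
    return masked, targets
```

Masking contiguous spans (rather than isolated bytes) forces the model to reconstruct multi-byte protocol fields from context, which is the intuition behind learning protocol-specific patterns.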
Chungang Lin
Institute of Computing Technology, Chinese Academy of Sciences, China.
Weiyao Zhang
Institute of Computing Technology, Chinese Academy of Sciences, China.
Tianyu Zuo
University of Chinese Academy of Sciences, China.
Chao Zha
University of Chinese Academy of Sciences, China.
Yilong Jiang
Institute of Computing Technology, Chinese Academy of Sciences, China.
Ruiqi Meng
University of Chinese Academy of Sciences, China.
Haitong Luo
Institute of Computing Technology, Chinese Academy of Sciences, China.
Xuying Meng
Institute of Computing Technology, Chinese Academy of Sciences, China.
Yujun Zhang
Institute of Computing Technology, Chinese Academy of Sciences, China.