🤖 AI Summary
To address the inefficiency (O(n²) self-attention complexity) and poor generalization to unseen sequence lengths (caused by rigid explicit positional encodings) of Transformer models in encrypted traffic classification, this paper proposes NetConv, a lightweight pre-trained convolutional architecture. Methodologically, NetConv stacks traffic convolution layers to capture local byte-level patterns; integrates window-wise byte scoring and sequence-wise byte gating mechanisms to enhance protocol-specific feature representation; and employs a continuous byte masking (CBM) objective for unsupervised pre-training on unlabeled traffic. Evaluated on four encrypted traffic classification tasks, NetConv improves average classification accuracy by 6.88% over prior methods while attaining 7.41× higher throughput than state-of-the-art Transformer baselines. The model thus delivers a favorable trade-off among computational efficiency, scalability to variable-length sequences, and classification performance.
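As a concrete illustration of how such a layer might be wired, the PyTorch sketch below pairs window-wise byte scoring (a depthwise 1-D convolution scoring each byte position from its local window) with sequence-wise byte gating (a global gate computed from pooled sequence context). The class name `TrafficConvLayer`, the dimensions, and the exact composition are illustrative assumptions, not the authors' published design.

```python
import torch
import torch.nn as nn


class TrafficConvLayer(nn.Module):
    """Hypothetical sketch of one stacked traffic convolution layer.

    Combines window-wise byte scoring (a depthwise 1-D convolution over
    local byte windows) with sequence-wise byte gating (a global gate
    from pooled sequence context). Structure and sizes are assumptions.
    """

    def __init__(self, d_model: int, kernel_size: int = 7):
        super().__init__()
        # Window-wise byte scoring: each position is scored from a local
        # window of neighbouring byte embeddings; cost is linear in
        # sequence length, unlike quadratic self-attention.
        self.score_conv = nn.Conv1d(
            d_model, d_model, kernel_size,
            padding=kernel_size // 2, groups=d_model,
        )
        # Sequence-wise byte gating: one gate vector per sequence,
        # derived from mean-pooled context, modulates every position.
        self.gate_proj = nn.Linear(d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) byte embeddings.
        scores = self.score_conv(x.transpose(1, 2)).transpose(1, 2)
        gate = torch.sigmoid(self.gate_proj(x.mean(dim=1, keepdim=True)))
        return self.norm(x + gate * torch.relu(scores))


# Example: 8 flows, 1500-byte payloads, 256-dim byte embeddings.
layer = TrafficConvLayer(d_model=256)
out = layer(torch.randn(8, 1500, 256))  # -> (8, 1500, 256)
```

Note that the convolution is also free of explicit positional encodings: position is carried implicitly by the sliding window, which is why such a layer can accept sequence lengths unseen during pre-training.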
📄 Abstract
Encrypted traffic classification is vital for modern network management and security. To reduce reliance on handcrafted features and labeled data, recent methods focus on learning generic representations through pre-training on large-scale unlabeled data. However, current pre-trained models face two limitations originating from the adopted Transformer architecture: (1) limited model efficiency, because the self-attention mechanism has quadratic complexity in sequence length; (2) unstable scalability to longer byte sequences, because explicit positional encodings fail to generalize to input lengths not seen during pre-training. In this paper, we investigate whether convolutions, with linear complexity and implicit positional encoding, are competitive with Transformers for pre-training-based encrypted traffic classification. A systematic comparison shows that convolutions achieve higher efficiency and scalability but lower classification performance. To address this trade-off, we propose NetConv, a novel pre-trained convolution model for encrypted traffic classification. NetConv employs stacked traffic convolution layers, which enhance the ability to capture localized byte-sequence patterns through window-wise byte scoring and sequence-wise byte gating. We design a continuous byte masking pre-training task to help NetConv learn protocol-specific patterns. Experimental results on four tasks demonstrate that NetConv improves average classification performance by 6.88% and model throughput by 7.41× over existing pre-trained models.
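The abstract names the continuous byte masking task but does not spell it out here. Below is a minimal sketch, assuming CBM hides contiguous spans of bytes (consistent with the name and with the stated goal of learning protocol-specific multi-byte patterns) rather than isolated positions; the function name, mask ratio, and span length are illustrative guesses, not values from the paper.

```python
import torch


def continuous_byte_mask(seq_len: int, mask_ratio: float = 0.15,
                         span: int = 8) -> torch.Tensor:
    """Hypothetical continuous byte masking (CBM) sketch.

    Whole contiguous byte spans are masked, so reconstruction requires
    modelling multi-byte protocol fields rather than single bytes.
    """
    mask = torch.zeros(seq_len, dtype=torch.bool)
    # Enough spans to cover roughly mask_ratio of the sequence;
    # randomly placed spans may overlap, which is fine for a sketch.
    num_spans = max(1, int(seq_len * mask_ratio / span))
    starts = torch.randint(0, max(1, seq_len - span), (num_spans,))
    for s in starts.tolist():
        mask[s:s + span] = True  # True = masked, to be predicted
    return mask


# Example: mask ~15% of a 1500-byte packet in 8-byte spans; the model
# would then be pre-trained to reconstruct the bytes at masked positions.
mask = continuous_byte_mask(seq_len=1500)
```

Masking contiguous spans rather than scattered single bytes is what makes the task sensitive to protocol structure: header fields, TLS record boundaries, and similar patterns extend over several adjacent bytes.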