SoK: Decoding the Enigma of Encrypted Network Traffic Classifiers

📅 2025-03-25
🤖 AI Summary
Problem: Modern encryption protocols (e.g., TLS 1.3) have rendered traditional network traffic classification (NTC) ineffective, while existing ML-based NTC studies suffer from outdated datasets, flawed assumptions, and methodological biases. Method: We conduct a systematization of knowledge (SoK), establishing a taxonomy of NTC design choices and benchmarking practices; we then run a large-scale empirical validation (348 feature occlusion experiments on state-of-the-art classifiers) to test foundational assumptions, exposing critical flaws such as training on unencrypted traffic and overfitting to legacy datasets. Contribution/Results: We demonstrate that state-of-the-art classifiers consistently fail on real-world encrypted traffic; we propose a reproducible best-practice guideline for NTC evaluation; and we delineate three key future directions: (1) focusing on realistic encrypted scenarios, (2) enabling dynamic dataset curation and updating, and (3) adopting causality-aware, domain-informed feature engineering.

📝 Abstract
The adoption of modern encryption protocols such as TLS 1.3 has significantly challenged traditional network traffic classification (NTC) methods. As a consequence, researchers are increasingly turning to machine learning (ML) approaches to overcome these obstacles. In this paper, we comprehensively analyze ML-based NTC studies, developing a taxonomy of their design choices, benchmarking suites, and prevalent assumptions impacting classifier performance. Through this systematization, we demonstrate widespread reliance on outdated datasets, oversights in design choices, and the consequences of unsubstantiated assumptions. Our evaluation reveals that the majority of proposed encrypted traffic classifiers have mistakenly utilized unencrypted traffic due to the use of legacy datasets. Furthermore, by conducting 348 feature occlusion experiments on state-of-the-art classifiers, we show how oversights in NTC design choices lead to overfitting, and validate or refute prevailing assumptions with empirical evidence. By highlighting lessons learned, we offer strategic insights, identify emerging research directions, and recommend best practices to support the development of real-world applicable NTC methodologies.
Problem

Research questions and friction points this paper is trying to address.

Analyzing ML-based encrypted traffic classification challenges
Identifying outdated datasets and design choice oversights
Evaluating classifier performance with empirical evidence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Machine learning for encrypted traffic classification
Taxonomy of design choices and benchmarks
Feature occlusion to validate assumptions
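
To make the feature occlusion idea concrete, here is a minimal sketch of one way such an experiment could look. It is not the authors' protocol: the synthetic flows, the feature names in `FEATURES`, the nearest-centroid classifier, and the choice to occlude a feature by overwriting it with its training mean are all assumptions made for illustration. The point is only that a large accuracy drop after masking one feature suggests the classifier leans on that feature, for example a field that would not be visible in properly encrypted traffic.

```python
import random
from statistics import mean

# Hypothetical feature names, chosen only for this illustration.
FEATURES = ["shortcut_field", "packet_size", "noise"]


def make_flows(n, seed):
    """Generate synthetic (features, label) pairs standing in for flows."""
    rng = random.Random(seed)
    flows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        features = [
            4.0 * label,                  # leaks the label, like a plaintext header field
            rng.gauss(1.0 * label, 1.0),  # weakly informative
            rng.gauss(0.0, 1.0),          # uninformative
        ]
        flows.append((features, label))
    return flows


def fit_centroids(flows):
    """Nearest-centroid 'training': one mean feature vector per class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in flows if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest in squared distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def accuracy(centroids, flows):
    return mean(1.0 if predict(centroids, x) == y else 0.0 for x, y in flows)


def occlude(flows, index, fill):
    """Copy the flows with one feature overwritten by a constant."""
    return [(x[:index] + [fill] + x[index + 1:], y) for x, y in flows]


def occlusion_drops(train, test):
    """Accuracy lost on the test set when each feature is masked in turn."""
    centroids = fit_centroids(train)
    baseline = accuracy(centroids, test)
    drops = {}
    for i, name in enumerate(FEATURES):
        fill = mean(x[i] for x, _ in train)  # mask with the training mean
        drops[name] = baseline - accuracy(centroids, occlude(test, i, fill))
    return baseline, drops


train = make_flows(400, seed=1)
test = make_flows(400, seed=2)
baseline, drops = occlusion_drops(train, test)
print(f"baseline accuracy: {baseline:.3f}")
for name, drop in drops.items():
    print(f"occluding {name}: accuracy drop {drop:.3f}")
```

In this toy setup the classifier depends almost entirely on `shortcut_field`, so masking it should hurt far more than masking the other two features. A real study would swap in actual classifiers and datasets, and might retrain after masking rather than only masking at test time.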