FlowXpert: Context-Aware Flow Embedding for Enhanced Traffic Detection in IoT Network

📅 2025-09-25

🤖 AI Summary

To address the sparsity of traditional features and insufficient semantic modeling caused by dynamic IoT traffic, this paper proposes a context-aware flow embedding method. It discards highly sparse temporal and length-based features and instead incorporates source-host contextual semantic features. A lightweight embedding framework is designed, integrating DBSCAN-based unsupervised clustering with contrastive learning to achieve fine-grained and robust semantic representation of network traffic. Evaluated on the real-world MAWI dataset, the method significantly outperforms multiple state-of-the-art models in detection accuracy, generalization capability, and robustness against noise and adversarial perturbations. Moreover, its low computational overhead enables real-time deployment on resource-constrained IoT devices. This work establishes a novel paradigm for efficient anomaly detection in IoT environments, balancing expressiveness, scalability, and practical deployability.

📝 Abstract

In the Internet of Things (IoT) environment, continuous interaction among a large number of devices generates complex and dynamic network traffic, which poses significant challenges to rule-based detection approaches. Machine learning (ML)-based traffic detection technology, capable of identifying anomalous patterns and potential threats within this traffic, serves as a critical component in ensuring network security. This study first identifies a significant issue with widely adopted feature extraction tools (e.g., CICFlowMeter): the extensive use of time- and length-related features leads to high sparsity, which adversely affects model convergence. Furthermore, existing traffic detection methods generally lack an embedding mechanism capable of efficiently and comprehensively capturing the semantic characteristics of network traffic. To address these challenges, we propose a novel feature extraction tool that eliminates traditional time and length features in favor of context-aware semantic features related to the source host, thus improving the generalizability of the model. In addition, we design an embedding training framework that integrates the unsupervised DBSCAN clustering algorithm with a contrastive learning strategy to effectively capture fine-grained semantic representations of traffic. Extensive empirical evaluations are conducted on the real-world MAWI dataset to validate the proposed method in terms of detection accuracy, robustness, and generalization. Comparative experiments against several state-of-the-art (SOTA) models demonstrate the superior performance of our approach. Furthermore, we confirm its applicability and deployability in real-time scenarios.
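One way to picture the combination of DBSCAN and contrastive learning described in the abstract is to treat cluster ids as pseudo-labels and score embeddings with a supervised-contrastive-style loss. The following is a minimal sketch under stated assumptions, not the paper's implementation: the toy 2-D points, the `eps`, `min_pts`, and `tau` values, the cosine similarity, and the choice to keep noise points as negatives only are all illustrative.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one cluster id per point, -1 for noise."""
    UNSEEN, NOISE = None, -1
    labels = [UNSEEN] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            nbrs = region_query(points, j, eps)
            if len(nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nbrs)
    return labels

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def contrastive_loss(embeddings, labels, tau=0.5):
    """Pull same-cluster embeddings together, push all others apart.

    Assumption: noise points (-1) never act as anchors or positives,
    but they still appear in the denominator as negatives.
    """
    total, anchors = 0.0, 0
    n = len(labels)
    for i, li in enumerate(labels):
        positives = [p for p in range(n) if p != i and li != -1 and labels[p] == li]
        if not positives:
            continue
        denom = sum(math.exp(cosine(embeddings[i], embeddings[a]) / tau)
                    for a in range(n) if a != i)
        total += -sum(
            math.log(math.exp(cosine(embeddings[i], embeddings[p]) / tau) / denom)
            for p in positives
        ) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Toy flow feature vectors: two dense groups and one outlier.
points = [(1.0, 0.0), (1.1, 0.0), (1.0, 0.1),
          (0.0, 1.0), (0.1, 1.0), (0.0, 1.1),
          (5.0, 5.0)]
labels = dbscan(points, eps=0.5, min_pts=3)

good_emb = points                  # clusters point in different directions
bad_emb = [(1.0, 1.0)] * len(points)  # every flow collapses to one direction
good_loss = contrastive_loss(good_emb, labels)
bad_loss = contrastive_loss(bad_emb, labels)
```

In this toy setup the collapsed embedding yields a higher loss than the separated one, which is the signal a trainable encoder would follow when minimizing the loss.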

Problem

Research questions and friction points this paper is trying to address.

High sparsity in IoT traffic features derived from time and length attributes

Lack of an embedding mechanism that captures the semantic characteristics of network traffic

Limited detection accuracy and generalization in dynamic IoT environments
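The sparsity concern can be made concrete by measuring the share of zero entries in each feature column. The sketch below assumes a small hand-made table; the column names `fwd_iat_std`, `bwd_pkt_len_max`, and `urg_flag_count` are hypothetical stand-ins for time- and length-related features and are not taken from the paper.

```python
def column_sparsity(rows):
    """Return the fraction of zero entries in each column of a feature table."""
    n_rows = len(rows)
    n_cols = len(rows[0])
    return [sum(1 for r in rows if r[c] == 0) / n_rows for c in range(n_cols)]

# Hypothetical columns: [fwd_iat_std, bwd_pkt_len_max, urg_flag_count]
flows = [
    [0.0, 0, 0],
    [0.0, 60, 0],
    [1.5, 0, 0],
    [0.0, 0, 0],
]
sparsity = column_sparsity(flows)  # one value in [0, 1] per column
```

Columns whose sparsity is close to 1 carry almost no signal for most flows, which is the kind of feature the paper argues against relying on.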

Innovation

Methods, ideas, or system contributions that make the work stand out.

Context-aware semantic features replacing time/length attributes

DBSCAN clustering integrated with a contrastive learning strategy

Unsupervised embedding framework capturing fine-grained traffic semantics
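As an illustration of what source-host context features might look like, the sketch below attaches per-host aggregates to each flow in a capture window. The specific features (`n_flows`, `n_dst_ips`, `n_dst_ports`) and the `(src, dst, dport)` flow tuple are assumptions chosen for the example; the paper's actual feature set is not described in this summary.

```python
from collections import defaultdict

def source_host_context(flows):
    """Attach per-source-host aggregates to every flow in a capture window."""
    dst_ips = defaultdict(set)
    dst_ports = defaultdict(set)
    flow_count = defaultdict(int)
    # First pass: summarize what each source host did in the window.
    for src, dst, dport in flows:
        dst_ips[src].add(dst)
        dst_ports[src].add(dport)
        flow_count[src] += 1
    # Second pass: every flow inherits the context of its source host.
    return [
        {
            "src": src,
            "n_flows": flow_count[src],
            "n_dst_ips": len(dst_ips[src]),
            "n_dst_ports": len(dst_ports[src]),
        }
        for src, _, _ in flows
    ]

# Hypothetical window: one host probing several ports, one host doing a DNS lookup.
flows = [
    ("10.0.0.5", "10.0.0.9", 22),
    ("10.0.0.5", "10.0.0.9", 23),
    ("10.0.0.5", "10.0.0.9", 80),
    ("10.0.0.7", "8.8.8.8", 53),
]
features = source_host_context(flows)
```

Unlike per-flow timing or length statistics, these aggregates are rarely zero, and a host that touches many ports on one destination stands out even when each individual flow looks unremarkable.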
