Federated Learning in the Wild: A Comparative Study for Cybersecurity under Non-IID and Unbalanced Settings

📅 2025-09-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In network security, acquiring high-quality training data under strict privacy constraints remains challenging. Method: This paper investigates the performance of federated learning (FL) for DDoS intrusion detection under realistic non-IID and imbalanced data conditions. Leveraging a Kubernetes-based containerized testbed, we construct a realistic network attack dataset and systematically evaluate FedAvg alongside multiple advanced FL algorithms across convergence speed, communication overhead, computational efficiency, and model accuracy. Contribution/Results: To the best of our knowledge, this is the first comprehensive empirical comparison of FL methods on practical, non-IID, and class-imbalanced intrusion detection tasks. Experimental results demonstrate that several optimized FL algorithms significantly improve both convergence rate and detection accuracy while preserving data locality and privacy compliance. This work provides empirical evidence and methodological guidance for developing efficient, privacy-preserving distributed intrusion detection systems.

📝 Abstract

Machine Learning (ML) techniques have shown strong potential for network traffic analysis; however, their effectiveness depends on access to representative, up-to-date datasets, which is limited in cybersecurity due to privacy and data-sharing restrictions. To address this challenge, Federated Learning (FL) has recently emerged as a paradigm that enables collaborative training of ML models across multiple clients while ensuring that sensitive data remains local. Nevertheless, Federated Averaging (FedAvg), the canonical FL algorithm, has been shown to converge poorly in heterogeneous environments where data are not independent and identically distributed (non-i.i.d.) and client datasets are unbalanced, conditions frequently observed in cybersecurity contexts. To overcome these challenges, several alternative FL strategies have been developed, yet their applicability to network intrusion detection remains insufficiently explored. This study systematically reviews and evaluates a range of FL methods in the context of intrusion detection for DDoS attacks. Using a dataset of network attacks collected within a Kubernetes-based testbed, we assess convergence efficiency, computational overhead, bandwidth consumption, and model accuracy. To the best of our knowledge, this is the first comparative analysis of FL algorithms for intrusion detection under realistic non-i.i.d. and unbalanced settings, providing new insights for the design of robust, privacy-preserving network security solutions.
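For readers unfamiliar with FedAvg, the aggregation step the abstract refers to can be sketched as a sample-size-weighted average of client model parameters. The snippet below is a minimal illustration only, not the paper's implementation: the function name `fedavg_aggregate`, the dictionary-of-lists weight format, and the toy client values are all assumptions made for this example.

```python
from typing import Dict, List, Sequence

# Hypothetical weight format for this sketch: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def fedavg_aggregate(client_weights: Sequence[Weights],
                     client_sizes: Sequence[int]) -> Weights:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    aggregated: Weights = {}
    for name in client_weights[0]:
        acc = [0.0] * len(client_weights[0][name])
        for weights, n_samples in zip(client_weights, client_sizes):
            share = n_samples / total  # this client's fraction of all samples
            for i, value in enumerate(weights[name]):
                acc[i] += share * value
        aggregated[name] = acc
    return aggregated


# Toy example (made-up values): two clients with very different dataset sizes.
clients = [{"w": [1.0, 0.0]}, {"w": [0.0, 1.0]}]
sizes = [900, 100]
global_weights = fedavg_aggregate(clients, sizes)
print(global_weights)
```

In this toy case the client holding 900 of the 1,000 samples contributes 90% of the averaged parameters. That gives some intuition for why unbalanced, non-i.i.d. client data can pull the global model toward the largest clients' local distributions.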

Problem

Research questions and friction points this paper is trying to address.

Evaluating federated learning methods for intrusion detection

Addressing poor convergence in non-IID, unbalanced cybersecurity data

Assessing FL algorithm performance under realistic network attack conditions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluated federated learning methods for intrusion detection

Compared algorithms under non-IID, unbalanced cybersecurity settings

Assessed convergence efficiency, accuracy, and resource consumption

🔎 Similar Papers

Federated Learning in Adversarial Environments: Testbed Design and Poisoning Resilience in Cybersecurity