FedP3E: Privacy-Preserving Prototype Exchange for Non-IID IoT Malware Detection in Cross-Silo Federated Learning

📅 2025-07-09
🤖 AI Summary
To address non-IID data, severe class imbalance, and privacy preservation in cross-institutional federated learning for IoT malware detection, this paper proposes FedP3E, a privacy-preserving class-prototype exchange framework. Clients construct class-level prototypes using Gaussian Mixture Models (GMMs) and perturb them with Gaussian noise before uploading them to the server. The server aggregates the noisy prototypes and distributes them back to the clients, which integrate them into local training together with SMOTE-based augmentation of rare malware classes. Because clients exchange only compact prototypes rather than raw data or gradients, the approach mitigates data heterogeneity and long-tail class distributions with little communication overhead. According to this summary, experiments on the N-BaIoT dataset show an average 12.6% improvement in F1-score and a 9.3% gain in detection accuracy over baseline methods across diverse non-IID and highly imbalanced scenarios, indicating that the method is robust in cross-silo federated IoT security settings.

📝 Abstract
As IoT ecosystems continue to expand across critical sectors, they have become prominent targets for increasingly sophisticated and large-scale malware attacks. The evolving threat landscape, combined with the sensitive nature of IoT-generated data, demands detection frameworks that are both privacy-preserving and resilient to data heterogeneity. Federated Learning (FL) offers a promising solution by enabling decentralized model training without exposing raw data. However, standard FL algorithms such as FedAvg and FedProx often fall short in real-world deployments characterized by class imbalance and non-IID data distributions -- particularly in the presence of rare or disjoint malware classes. To address these challenges, we propose FedP3E (Privacy-Preserving Prototype Exchange), a novel FL framework that supports indirect cross-client representation sharing while maintaining data privacy. Each client constructs class-wise prototypes using Gaussian Mixture Models (GMMs), perturbs them with Gaussian noise, and transmits only these compact summaries to the server. The aggregated prototypes are then distributed back to clients and integrated into local training, supported by SMOTE-based augmentation to enhance representation of minority malware classes. Rather than relying solely on parameter averaging, our prototype-driven mechanism enables clients to enrich their local models with complementary structural patterns observed across the federation -- without exchanging raw data or gradients. This targeted strategy reduces the adverse impact of statistical heterogeneity with minimal communication overhead. We evaluate FedP3E on the N-BaIoT dataset under realistic cross-silo scenarios with varying degrees of data imbalance.
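The prototype exchange described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes each client already holds class-wise prototype vectors (for example, the component means of a fitted GMM), that prototypes are L2-clipped before noise is added, and that the server takes an unweighted per-class mean. The function names, the clipping step, the aggregation rule, and the parameters `sigma` and `max_norm` are all hypothetical; the abstract does not specify them.

```python
import math
import random
from collections import defaultdict


def clip_l2(vec, max_norm):
    """Scale vec down so its L2 norm is at most max_norm (assumed preprocessing)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0 or norm <= max_norm:
        return list(vec)
    scale = max_norm / norm
    return [v * scale for v in vec]


def perturb_prototype(vec, sigma, max_norm, rng):
    """Clip a prototype, then add independent Gaussian noise to each coordinate."""
    clipped = clip_l2(vec, max_norm)
    return [v + rng.gauss(0.0, sigma) for v in clipped]


def aggregate(client_uploads):
    """Average the uploaded prototypes per class label.

    client_uploads is a list with one dict per client, mapping a class label
    to a list of (already perturbed) prototype vectors. Unweighted averaging
    is an assumption made for this sketch.
    """
    sums = {}
    counts = defaultdict(int)
    for upload in client_uploads:
        for label, prototypes in upload.items():
            for proto in prototypes:
                if label not in sums:
                    sums[label] = [0.0] * len(proto)
                sums[label] = [s + v for s, v in zip(sums[label], proto)]
                counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative prototypes; the class labels are placeholders.
    client_a = {"benign": [[0.2, 0.1]], "mirai": [[2.0, 1.5]]}
    client_b = {"mirai": [[1.8, 1.7]]}  # this client never saw "benign"
    uploads = [
        {label: [perturb_prototype(p, sigma=0.05, max_norm=5.0, rng=rng)
                 for p in protos]
         for label, protos in client.items()}
        for client in (client_a, client_b)
    ]
    print(aggregate(uploads))
```

Clipping followed by Gaussian noise follows the usual pattern of the Gaussian mechanism, but calibrating `sigma` to a specific privacy budget is outside the scope of this sketch. Note that a client missing a class entirely (here `client_b` and "benign") would still receive a prototype for it after aggregation, which is the mechanism the abstract relies on for disjoint malware classes.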
Problem

Research questions and friction points this paper is trying to address.

Detecting non-IID IoT malware with privacy preservation
Addressing class imbalance in federated learning for IoT
Enhancing cross-silo FL with prototype exchange
Innovation

Methods, ideas, or system contributions that make the work stand out.

Privacy-preserving prototype exchange via GMMs
SMOTE-based augmentation for minority classes
Aggregated prototypes enrich local models
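
As a rough illustration of the SMOTE-based augmentation listed above, the sketch below generates synthetic minority-class points by interpolating between a sample and one of its nearest same-class neighbours, which is the core idea of SMOTE. It is a simplified stand-in written with the standard library only; the function name `smote_like` and its parameters are hypothetical, and how FedP3E combines local samples with received prototypes is not specified in the abstract. In practice a library implementation such as `SMOTE` from imbalanced-learn would typically be used.

```python
import math
import random


def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic points from a list of minority-class vectors.

    For each synthetic point: pick a random base sample, pick one of its k
    nearest neighbours within the same class, and interpolate between them.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other samples, by Euclidean distance.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


if __name__ == "__main__":
    rng = random.Random(42)
    # Assumed pool: a client's few local samples of a rare class, which
    # could be extended with prototypes received from the server.
    rare_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for point in smote_like(rare_class, n_new=3, k=2, rng=rng):
        print(point)
```

Because every synthetic point lies on a segment between two existing points, the augmented data stays inside the region already covered by the minority class rather than extrapolating beyond it.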