Addressing Data Heterogeneity in Federated Learning with Adaptive Normalization-Free Feature Recalibration

📅 2024-10-02
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the core challenge of degraded model performance caused by client statistical heterogeneity in federated learning, this paper proposes Adaptive Normalization-free Feature Recalibration (ANFR), a lightweight, plug-and-play backbone enhancement. ANFR couples weight standardization with channel-wise attention, improving class selectivity and the robustness of attention distributions without relying on batch normalization. Under differentially private training (DP-SGD, ε = 2–4), it achieves an appealing privacy-utility trade-off, with accuracy approaching that of non-private baselines. The method is compatible with arbitrary aggregation strategies and supports both global and personalized federated learning paradigms. Extensive experiments demonstrate consistent improvements over state-of-the-art methods across diverse datasets, heterogeneity settings, and aggregation algorithms, with negligible computational overhead.

📝 Abstract
Federated learning is a decentralized collaborative training paradigm that preserves stakeholders' data ownership while improving performance and generalization. However, statistical heterogeneity among client datasets poses a fundamental challenge by degrading system performance. To address this issue, we propose Adaptive Normalization-free Feature Recalibration (ANFR), an architecture-level approach that combines weight standardization and channel attention. Weight standardization normalizes layer weights instead of activations, making it less susceptible to mismatched client statistics and inconsistent averaging, and therefore more robust under heterogeneity. Channel attention produces learnable scaling factors for feature maps, suppressing those that are inconsistent between clients due to heterogeneity. We demonstrate that combining these techniques boosts model performance beyond their individual contributions, by enhancing class selectivity and optimizing the channel attention weight distribution. ANFR operates independently of the aggregation method and is effective in both global and personalized federated learning settings, with minimal computational overhead. Furthermore, when training with differential privacy, ANFR achieves an appealing balance between privacy and utility, enabling strong privacy guarantees without sacrificing performance. By integrating weight standardization and channel attention in the backbone model, ANFR offers a novel and versatile approach to the challenge of statistical heterogeneity. We demonstrate through extensive experiments that ANFR consistently outperforms established baselines across various aggregation methods, datasets, and heterogeneity conditions.
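The abstract's two ingredients can be sketched in isolation. The following is a minimal NumPy sketch, not the paper's implementation: the function names, the squeeze-and-excitation-style bottleneck, and the tensor shapes are illustrative assumptions.

```python
import numpy as np

def standardize_weights(w, eps=1e-5):
    """Weight standardization: zero mean, unit variance per output channel.
    w: conv kernel of shape (out_ch, in_ch, kh, kw). Normalizes weights,
    not activations, so no batch statistics are involved."""
    mean = w.mean(axis=(1, 2, 3), keepdims=True)
    std = w.std(axis=(1, 2, 3), keepdims=True)
    return (w - mean) / (std + eps)

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation-style channel attention (illustrative).
    x: feature map (batch, ch, h, w); w1, w2: bottleneck FC weights.
    Produces a per-channel gate in (0, 1) that rescales the feature map."""
    squeezed = x.mean(axis=(2, 3))                 # global average pool -> (batch, ch)
    hidden = np.maximum(squeezed @ w1, 0.0)        # ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # sigmoid gates per channel
    return x * scale[:, :, None, None]             # recalibrate feature maps
```

Because the gate lies in (0, 1), channels whose statistics are inconsistent across clients can be attenuated rather than propagated, which is the suppression effect the abstract describes.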
Problem

Research questions and friction points this paper is trying to address.

Statistical heterogeneity across client datasets degrading federated learning performance
Sensitivity of activation normalization to mismatched client statistics and inconsistent averaging
Maintaining model performance under inconsistent data distributions
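Statistical heterogeneity of the kind listed above is commonly simulated in federated learning benchmarks with a Dirichlet label-skew partition. A minimal sketch follows; this sampling scheme is a standard benchmark convention, and is assumed here rather than taken from the paper's exact protocol.

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices across clients with label skew: for each class,
    draw per-client proportions from Dirichlet(alpha). Smaller alpha gives
    more skewed (more heterogeneous) client datasets."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        # Cut points that divide this class's samples per the drawn proportions.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices
```

Sweeping `alpha` (e.g. 0.1 vs 10) is the usual way to produce the "various heterogeneity conditions" under which methods like ANFR are evaluated.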
Innovation

Methods, ideas, or system contributions that make the work stand out.

Weight standardization for robustness to mismatched client statistics
Channel attention providing learnable per-channel scaling of feature maps
Architecture-level integration enabling a privacy-utility balance under DP-SGD
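Because ANFR is an architecture-level change, server-side aggregation is untouched; any strategy can be used. As an illustration of that compatibility, here is a minimal FedAvg step (a hypothetical helper written for this summary, not code from the paper).

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """FedAvg: average client model parameters, weighted by local sample count.
    client_weights: list of dicts {param_name: ndarray}, one per client.
    client_sizes: number of training samples at each client.
    A backbone change such as ANFR leaves this aggregation step unchanged."""
    total = float(sum(client_sizes))
    keys = client_weights[0].keys()
    return {
        k: sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
        for k in keys
    }
```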
Vasilis Siomos
CitAI Research Centre, Department of Computer Science, City St George’s, University of London
Sergio Naval Marimont
CitAI Research Centre, Department of Computer Science, City St George’s, University of London
Jonathan Passerat-Palmbach
Imperial College London, Flashbots
Privacy Enhancing Technologies · Federated Learning · AI & Privacy · Privacy-Preserving Machine
G. Tarroni
CitAI Research Centre, Department of Computer Science, City St George’s, University of London and BioMedIA, Department of Computing, Imperial College London