🤖 AI Summary
Mounting stealthy and persistent backdoor attacks against federated learning (FL) remains difficult: existing label-supervised approaches are easily detectable and lack robustness. To address this, we propose SPA, the first feature-space-alignment-based backdoor attack framework that is both covert and enduring. Departing from conventional label-dependent trigger mechanisms, SPA implants semantically consistent backdoors by minimizing the distance between trigger-embedded samples and target-class samples in the feature space. It integrates adaptive adversarial trigger optimization, boundary search, and gradient obfuscation to remain effective under non-IID data distributions and state-of-the-art FL defenses. Experiments across multiple FL benchmarks show near-perfect attack success rates (>99%), negligible utility degradation (<2% accuracy drop), and backdoor effects that persist over 20+ communication rounds, significantly outperforming prior work in stealth, persistence, and resilience.
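To make the feature-space-alignment idea concrete, the sketch below shows one way such an objective could be written in PyTorch. This is a minimal illustration under our own assumptions, not the SPA implementation: helper names such as `apply_trigger`, `features_fn`, `classifier_fn`, and the weight `lambda_align` are hypothetical.

```python
import torch
import torch.nn.functional as F

def apply_trigger(x, trigger, mask):
    """Blend a small trigger patch into the inputs (illustrative only)."""
    return x * (1 - mask) + trigger * mask

def alignment_backdoor_loss(features_fn, classifier_fn, benign_x, benign_y,
                            target_x, trigger, mask, lambda_align=1.0):
    # Clean-task loss keeps the global model's utility intact (stealth).
    clean_feats = features_fn(benign_x)
    task_loss = F.cross_entropy(classifier_fn(clean_feats), benign_y)

    # Pull trigger-embedded features toward the target-class feature centroid,
    # instead of supervising poisoned inputs with the target label directly.
    poison_feats = features_fn(apply_trigger(benign_x, trigger, mask))
    target_centroid = features_fn(target_x).detach().mean(dim=0)
    align_loss = F.mse_loss(poison_feats.mean(dim=0), target_centroid)

    return task_loss + lambda_align * align_loss
```

Because the poisoned samples are never paired with the target label, the malicious update looks like an ordinary task-loss update, which is what the summary attributes the stealth to.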
📝 Abstract
Federated Learning (FL) has emerged as a leading paradigm for privacy-preserving distributed machine learning, yet its distributed nature introduces unique security challenges, notably the threat of backdoor attacks. Existing backdoor strategies predominantly rely on end-to-end label supervision, which, despite its efficacy, often results in detectable feature disentanglement and limited persistence. In this work, we propose a novel and stealthy backdoor attack framework, named SPA, which fundamentally departs from traditional approaches by leveraging feature-space alignment rather than direct trigger-label association. Specifically, SPA reduces the representational distance between backdoor-trigger features and target-class features, enabling the global model to misclassify trigger-embedded inputs with high stealth and persistence. We further introduce an adaptive, adversarial trigger optimization mechanism that uses boundary search in the feature space to enhance attack longevity and effectiveness, even under FL defenses and non-IID data distributions. Extensive experiments on various FL benchmarks demonstrate that SPA consistently achieves high attack success rates with minimal impact on model utility, maintains robustness under challenging participation and data heterogeneity conditions, and exhibits persistent backdoor effects far exceeding those of conventional techniques. Our results call urgent attention to the evolving sophistication of backdoor threats in FL and underscore the pressing need for advanced, feature-level defense techniques.
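For intuition on the adaptive trigger optimization mentioned above, the following hedged sketch approximates the feature-space boundary search with projected gradient steps on the trigger. It is our own illustrative reading, not the authors' code; `features_fn`, `mask`, `steps`, and `eps` are assumed placeholders.

```python
import torch
import torch.nn.functional as F

def optimize_trigger(features_fn, benign_x, target_x, mask,
                     steps=100, lr=0.01, eps=0.1):
    """Adversarially optimize a trigger so that trigger-embedded features move
    toward the target-class feature region (boundary search approximated here
    by projected gradient descent; hypothetical sketch)."""
    trigger = torch.zeros_like(benign_x[:1], requires_grad=True)
    target_centroid = features_fn(target_x).detach().mean(dim=0)

    opt = torch.optim.Adam([trigger], lr=lr)
    for _ in range(steps):
        poisoned = benign_x * (1 - mask) + trigger * mask
        dist = F.mse_loss(features_fn(poisoned).mean(dim=0), target_centroid)
        opt.zero_grad()
        dist.backward()
        opt.step()
        # Project the trigger back into a small-perturbation ball so it
        # stays visually inconspicuous on the poisoned inputs.
        with torch.no_grad():
            trigger.clamp_(-eps, eps)
    return trigger.detach()
```

In this reading, the optimized trigger is then used with the alignment objective above during local training, so the backdoor is carried by feature geometry rather than by flipped labels.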