🤖 AI Summary
Federated learning (FL) faces dual-facet attacks that simultaneously compromise both model utility and fairness, yet existing defenses lack effective countermeasures. This paper proposes GuardFed, the first framework to formalize the Dual-Facet Attack (DFA) threat model and its variants (S-DFA, Sp-DFA), exposing critical vulnerabilities of mainstream robust FL methods in preserving group fairness. GuardFed introduces a fairness-aware reference model, trained on a small set of clean server-side data augmented with synthetically generated samples, to jointly assess client trust along two dimensions: utility deviation and fairness degradation. It further employs an adaptive trust-weighted aggregation mechanism that down-weights untrustworthy updates. Extensive experiments across diverse non-IID and adversarial settings demonstrate that GuardFed consistently outperforms state-of-the-art methods on multiple real-world datasets, achieving high accuracy while substantially and stably improving group fairness, thereby providing joint robustness against both utility loss and fairness violation.
📝 Abstract
Federated learning (FL) enables privacy-preserving collaborative model training but remains vulnerable to adversarial behaviors that compromise model utility or fairness across sensitive groups. While extensive studies have examined attacks targeting either objective, strategies that simultaneously degrade both utility and fairness remain largely unexplored. To bridge this gap, we introduce the Dual-Facet Attack (DFA), a novel threat model that concurrently undermines predictive accuracy and group fairness. Two variants, Synchronous DFA (S-DFA) and Split DFA (Sp-DFA), are further proposed to capture distinct real-world collusion scenarios. Experimental results show that existing robust FL defenses, including hybrid aggregation schemes, fail to resist DFAs effectively. To counter these threats, we propose GuardFed, a self-adaptive defense framework that maintains a fairness-aware reference model using a small amount of clean server data augmented with synthetic samples. In each training round, GuardFed computes a dual-perspective trust score for every client by jointly evaluating its utility deviation and fairness degradation, thereby enabling selective aggregation of trustworthy updates. Extensive experiments on real-world datasets demonstrate that GuardFed consistently preserves both accuracy and fairness under diverse non-IID and adversarial conditions, achieving state-of-the-art performance compared with existing robust FL methods.
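To make the defense concrete, here is a minimal sketch of the dual-perspective trust scoring and trust-weighted aggregation the abstract describes. The linear scorer, the demographic-parity gap as the fairness metric, the exponential trust weighting, and the `alpha` trade-off parameter are all illustrative assumptions, not the paper's actual formulation; the real GuardFed operates on full model updates against its fairness-aware reference model.

```python
import numpy as np

def accuracy(w, X, y):
    # Illustrative linear scorer: predict 1 when X @ w > 0.
    preds = (X @ w > 0).astype(int)
    return (preds == y).mean()

def dp_gap(w, X, y, group):
    # Demographic-parity gap: |P(pred=1 | g=0) - P(pred=1 | g=1)|.
    preds = (X @ w > 0).astype(int)
    return abs(preds[group == 0].mean() - preds[group == 1].mean())

def trust_scores(client_updates, ref_w, X, y, group, alpha=0.5):
    # Score each client update against the fairness-aware reference model
    # on clean server-side data, along two dimensions:
    #   utility deviation   = accuracy drop relative to the reference
    #   fairness degradation = increase in the fairness gap over the reference
    ref_acc = accuracy(ref_w, X, y)
    ref_gap = dp_gap(ref_w, X, y, group)
    scores = []
    for w in client_updates:
        util_dev = max(0.0, ref_acc - accuracy(w, X, y))
        fair_deg = max(0.0, dp_gap(w, X, y, group) - ref_gap)
        # Exponential weighting (assumed): low deviation -> high trust.
        scores.append(np.exp(-(alpha * util_dev + (1 - alpha) * fair_deg)))
    s = np.array(scores)
    return s / s.sum()

def aggregate(client_updates, scores):
    # Trust-weighted average of client updates (selective aggregation).
    return np.average(np.stack(client_updates), axis=0, weights=scores)
```

In this sketch a client whose update hurts accuracy or widens the group fairness gap, relative to the server's reference model, receives an exponentially smaller weight, so a dual-facet adversary is suppressed on both axes at once rather than only on the utility axis as in classical robust aggregation.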