🤖 AI Summary
Federated learning suffers significant performance degradation under data distribution shifts and local outlier contamination, yet existing distributionally robust approaches largely overlook the non-geometric disruptive effects of outliers. To address this, we propose an outlier-robust distributionally robust federated learning framework. Our method is the first to jointly incorporate the unbalanced Wasserstein distance and KL-divergence regularization into the ambiguity set construction, thereby unifying the modeling of geometric distributional shifts and non-geometric outlier impacts. This leads to a decomposable min-max-max optimization problem, for which we provide theoretical guarantees on robustness and convergence. Extensive experiments on synthetic and real-world benchmarks demonstrate that our approach consistently outperforms state-of-the-art federated learning algorithms, particularly under concurrent distribution shifts and outlier contamination, achieving superior robustness and stable convergence.
📝 Abstract
Federated learning (FL) enables collaborative model training without direct data sharing, but its performance can degrade significantly in the presence of data distribution perturbations. Distributionally robust optimization (DRO) provides a principled framework for handling this by optimizing performance against the worst-case distributions within a prescribed ambiguity set. However, existing DRO-based FL methods often overlook the detrimental impact of outliers in local datasets, which can disproportionately bias the learned models. In this work, we study distributionally robust federated learning with explicit outlier resilience. We introduce a novel ambiguity set based on the unbalanced Wasserstein distance, which jointly captures geometric distributional shifts and incorporates a non-geometric Kullback-Leibler penalization to mitigate the influence of outliers. This formulation naturally leads to a challenging min-max-max optimization problem. To enable decentralized training, we reformulate the problem as a tractable Lagrangian penalty optimization, which admits robustness certificates. Building on this reformulation, we propose the distributionally outlier-robust federated learning algorithm and establish its convergence guarantees. Extensive experiments on both synthetic and real-world datasets demonstrate the effectiveness of our approach.
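The Lagrangian penalty idea behind such DRO reformulations can be illustrated with a minimal sketch: instead of searching over an entire ambiguity set, the inner maximization perturbs each training sample, trading off higher loss against a quadratic transport penalty, and the outer step updates the model on the perturbed samples. Everything below (the scalar regression model, the penalty weight `gamma`, the step sizes) is an illustrative assumption for exposition, not the paper's actual algorithm, which additionally uses an unbalanced Wasserstein ambiguity set with a KL term.

```python
# Sketch of Lagrangian-penalty distributionally robust training on a toy
# 1-D regression problem (illustrative assumptions, not the paper's method).
import numpy as np

rng = np.random.default_rng(0)

def loss(theta, x, y):
    # Per-sample squared loss for the scalar model y ≈ theta * x.
    return 0.5 * (theta * x - y) ** 2

def grad_theta(theta, x, y):
    return (theta * x - y) * x

def grad_x(theta, x, y):
    return (theta * x - y) * theta

def inner_max(theta, x, y, gamma=10.0, steps=10, lr=0.05):
    """Approximate argmax_{x'} loss(theta, x', y) - (gamma/2) * (x' - x)^2
    by a few gradient-ascent steps; gamma > theta^2 keeps it concave."""
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_x(theta, x_adv, y) - gamma * (x_adv - x)
        x_adv += lr * g
    return x_adv

# Toy data: y = 2x with a few gross outliers mixed in.
x = rng.normal(size=200)
y = 2.0 * x + 0.1 * rng.normal(size=200)
y[:5] += 20.0  # outlier contamination

theta = 0.0
for _ in range(300):
    x_adv = inner_max(theta, x, y)           # worst-case perturbed samples
    theta -= 0.01 * np.mean(grad_theta(theta, x_adv, y))  # robust update
```

Because each perturbed sample (weakly) increases its own loss, the surrogate objective evaluated on `x_adv` upper-bounds the empirical loss, which is the mechanism behind the robustness certificates mentioned in the abstract; a federated version would run the inner maximization locally on each client and aggregate only the model updates.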