Towards Collaborative Fairness in Federated Learning Under Imbalanced Covariate Shift

📅 2025-07-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses collaborative fairness degradation in federated learning caused by imbalanced covariate shift. It proposes FedAKD, a novel framework that identifies misclassified samples as the primary driver of this shift and introduces an asynchronous knowledge distillation mechanism leveraging high-confidence, correctly classified samples. The mechanism alternates updates between clients and the server to jointly optimize model accuracy and fairness, and the authors establish convergence guarantees for FedAKD. Empirically, on FashionMNIST, CIFAR-10, and a real-world electronic health record dataset, FedAKD achieves significant improvements: +12.7% in collaborative fairness and +3.9% in global accuracy, while also enhancing client participation willingness.

📝 Abstract
Collaborative fairness is a crucial challenge in federated learning. However, existing approaches often overlook a practical yet complex form of heterogeneity: imbalanced covariate shift. We provide a theoretical analysis of this setting, which motivates the design of FedAKD (Federated Asynchronous Knowledge Distillation), a simple yet effective approach that balances accurate prediction with collaborative fairness. FedAKD consists of client and server updates. In the client update, we introduce a novel asynchronous knowledge distillation strategy based on our preliminary analysis, which reveals that while correctly predicted samples exhibit similar feature distributions across clients, incorrectly predicted samples show significant variability. This suggests that imbalanced covariate shift primarily arises from misclassified samples. Leveraging this insight, our approach first applies traditional knowledge distillation to update client models while keeping the global model fixed. Next, we select correctly predicted high-confidence samples and update the global model using these samples while keeping client models fixed. The server update simply aggregates all client models. We further provide a theoretical proof of FedAKD's convergence. Experimental results on public datasets (FashionMNIST and CIFAR-10) and a real-world Electronic Health Records (EHR) dataset demonstrate that FedAKD significantly improves collaborative fairness, enhances predictive accuracy, and fosters client participation even under highly heterogeneous data distributions.
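The alternating update the abstract describes can be sketched as follows. This is a minimal illustrative NumPy sketch, not the authors' implementation: the logistic models, learning rate, temperature, confidence threshold, gradient forms (with temperature constants dropped), and the choice to average per-client global copies at the server are all assumptions made for the example.

```python
# Minimal sketch of FedAKD-style alternating updates on synthetic data.
# All names, hyperparameters, and gradient forms are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
D, C = 8, 3    # feature dimension, number of classes
TAU = 0.8      # confidence threshold for "high-confidence" samples (assumed)
LR = 0.1       # learning rate (assumed)
T = 2.0        # distillation temperature (assumed)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def client_update(W_client, W_global, X):
    """Step 1: distill the fixed global model into the client model."""
    teacher = softmax(X @ W_global / T)        # soft targets, global model fixed
    student = softmax(X @ W_client / T)
    grad = X.T @ (student - teacher) / len(X)  # simplified KD gradient
    return W_client - LR * grad

def select_high_confidence(W_client, X, y):
    """Keep samples the client classifies correctly with confidence >= TAU."""
    p = softmax(X @ W_client)
    conf, pred = p.max(axis=1), p.argmax(axis=1)
    mask = (pred == y) & (conf >= TAU)
    return X[mask], y[mask]

def global_update(W_global, X_hc, y_hc):
    """Step 2: update the global model on high-confidence samples only."""
    if len(X_hc) == 0:
        return W_global
    p = softmax(X_hc @ W_global)
    onehot = np.eye(C)[y_hc]
    grad = X_hc.T @ (p - onehot) / len(X_hc)   # CE gradient, client model fixed
    return W_global - LR * grad

def server_aggregate(weights):
    """Server update: plain average (a simplifying assumption here)."""
    return np.mean(weights, axis=0)

# One communication round over 3 clients with heterogeneous synthetic data.
clients = [(rng.normal(size=(50, D)), rng.integers(0, C, 50)) for _ in range(3)]
W_g = rng.normal(scale=0.1, size=(D, C))
Ws = [W_g.copy() for _ in clients]

global_copies = []
for i, (X, y) in enumerate(clients):
    Ws[i] = client_update(Ws[i], W_g, X)            # client step, global fixed
    X_hc, y_hc = select_high_confidence(Ws[i], X, y)
    global_copies.append(global_update(W_g, X_hc, y_hc))  # global step, client fixed
W_g = server_aggregate(global_copies)               # server aggregation
print(W_g.shape)  # (8, 3)
```

The key structural point the sketch captures is the asynchrony: client models and the global model are never updated simultaneously; each step holds one side fixed, and only correctly classified high-confidence samples flow back into the global model.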
Problem

Research questions and friction points this paper is trying to address.

Address collaborative fairness in federated learning
Mitigate imbalanced covariate shift in heterogeneous data
Improve accuracy and client participation in FL
Innovation

Methods, ideas, or system contributions that make the work stand out.

Asynchronous knowledge distillation balances accuracy and collaborative fairness
Global model is refined on clients' high-confidence, correctly classified samples
Server aggregates all client models, with proven convergence
Authors
Tianrun Yu, The Pennsylvania State University
Jiaqi Wang, The Pennsylvania State University
Haoyu Wang, State University of New York at Albany
Mingquan Lin, Assistant Professor, University of Minnesota (Medical image analysis, Deep learning)
Han Liu, Dalian University of Technology
Nelson S. Yee, The Pennsylvania State University
Fenglong Ma, Associate Professor, Pennsylvania State University (Data Mining, Machine Learning, Health Informatics)