🤖 AI Summary
To address model drift and degraded generalization caused by non-IID data in federated learning, this paper proposes pFedKD-WCL, a novel algorithm that integrates knowledge distillation with bi-level optimization to jointly achieve global convergence and local personalization. Its key contribution is the first incorporation of a weighted combination loss into the federated knowledge distillation framework, dynamically balancing the teacher-guided distillation loss (which enforces global consistency) against the local fitting loss (which strengthens client-specific adaptation). The method is evaluated with multinomial logistic regression and multilayer perceptron models on the MNIST dataset and a synthetic non-IID dataset. Results show that pFedKD-WCL outperforms FedAvg, FedProx, Per-FedAvg, and pFedMe in both test accuracy and convergence speed.
📝 Abstract
Federated learning (FL) offers a privacy-preserving framework for distributed machine learning, enabling collaborative model training across diverse clients without centralizing sensitive data. However, statistical heterogeneity, characterized by non-independent and identically distributed (non-IID) client data, poses significant challenges, leading to model drift and poor generalization. This paper proposes a novel algorithm, pFedKD-WCL (Personalized Federated Knowledge Distillation with Weighted Combination Loss), which integrates knowledge distillation with bi-level optimization to address non-IID challenges. pFedKD-WCL leverages the current global model as a teacher to guide local models, optimizing both global convergence and local personalization efficiently. We evaluate pFedKD-WCL on the MNIST dataset and a synthetic dataset with non-IID partitioning, using multinomial logistic regression and multilayer perceptron models. Experimental results demonstrate that pFedKD-WCL outperforms state-of-the-art algorithms, including FedAvg, FedProx, Per-FedAvg, and pFedMe, in terms of accuracy and convergence speed.
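The weighted combination loss at the core of pFedKD-WCL can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function names, the single mixing weight `alpha`, and the temperature-scaled KL distillation term are assumptions drawn from standard knowledge-distillation practice, where the global model's soft predictions act as the teacher signal and a cross-entropy term fits the local labels.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_combination_loss(student_logits, teacher_logits, label,
                              alpha=0.5, temperature=2.0):
    """Hypothetical weighted combination of distillation and local losses.

    alpha weights the teacher-guided KL term (global consistency);
    (1 - alpha) weights the local cross-entropy (client adaptation).
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student), scaled by T^2 as in standard distillation
    kd = temperature ** 2 * sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student) if pt > 0
    )
    # Local fitting loss: cross-entropy on the true label at T = 1
    ce = -math.log(softmax(student_logits)[label])
    return alpha * kd + (1 - alpha) * ce
```

With `alpha = 0` the loss reduces to plain local training, and with `alpha = 1` the client purely imitates the global teacher; intermediate values trade off personalization against global consistency, which is the balance the weighted combination loss is designed to control.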