🤖 AI Summary
To address slow convergence in wireless federated learning caused by data heterogeneity and limited bandwidth, this paper proposes a collective gradient divergence (CGD) metric that jointly models device-level and sample-level heterogeneity. For classification tasks, the device-level CGD is transformed into the weighted earth moving distance (WEMD) between the group-wise and global data distributions, while the sample-level CGD is statistically bounded by a sampling variance that shrinks as more samples are scheduled. The authors design a polynomial-time scheduling algorithm, FedCGD, that on CIFAR-10 improves classification accuracy by up to 4.2% while scheduling 41.8% fewer devices, and that flexibly trades off WEMD against sampling variance. The core innovation lies in redefining gradient divergence from the perspective of the scheduled device group as a whole, departing from conventional single-device bias modeling, thereby improving communication efficiency and model convergence.
📝 Abstract
Federated learning (FL) is a promising paradigm for multiple devices to cooperatively train a model. When applied in wireless networks, two issues consistently affect the performance of FL, i.e., data heterogeneity of devices and limited bandwidth. Many papers have investigated device scheduling strategies considering the two issues. However, most of them treat data heterogeneity as a property of individual devices. In this paper, we prove that the convergence speed of FL is affected by the sum of device-level and sample-level collective gradient divergence (CGD). The device-level CGD refers to the gradient divergence of the scheduled device group, instead of the sum of the individual device divergences. The sample-level CGD is statistically upper bounded by sampling variance, which is inversely proportional to the total number of samples scheduled for local update. To derive a tractable form of the device-level CGD, we further consider a classification problem and transform it into the weighted earth moving distance (WEMD) between the group distribution and the global distribution. Then we propose the FedCGD algorithm to minimize the sum of multi-level CGDs by balancing WEMD and sampling variance, within polynomial time. Simulations show that the proposed strategy increases classification accuracy on the CIFAR-10 dataset by up to 4.2% while scheduling 41.8% fewer devices, and flexibly switches between reducing WEMD and reducing sampling variance.
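The trade-off the abstract describes can be illustrated with a toy sketch. Assuming each device reports its per-class sample counts, a greedy scheduler can balance a distribution-gap term (here approximated as a weighted L1 distance between the scheduled group's aggregate label distribution and the global one, a simplification of the paper's WEMD) against a sampling-variance proxy that is inversely proportional to the number of scheduled samples. All function names, the objective form, and the greedy rule below are illustrative assumptions, not the paper's actual FedCGD algorithm.

```python
import numpy as np

def wemd(group_dist, global_dist, weights=None):
    """Weighted L1 gap between label distributions -- an illustrative
    stand-in for the paper's weighted earth moving distance (WEMD)."""
    w = np.ones_like(global_dist) if weights is None else weights
    return float(np.sum(w * np.abs(group_dist - global_dist)))

def objective(selected, label_counts, global_dist, var_coeff=1.0):
    """Distribution-gap term plus a sampling-variance proxy ~ 1/n,
    where n is the total number of samples the group contributes."""
    counts = label_counts[selected].sum(axis=0)
    n = counts.sum()
    if n == 0:
        return float("inf")
    return wemd(counts / n, global_dist) + var_coeff / n

def greedy_schedule(label_counts, budget, var_coeff=1.0):
    """Greedily add the device that most reduces the combined objective,
    stopping early once no candidate improves it (polynomial time)."""
    global_dist = label_counts.sum(axis=0) / label_counts.sum()
    selected, remaining = [], set(range(len(label_counts)))
    while remaining and len(selected) < budget:
        best = min(remaining,
                   key=lambda d: objective(selected + [d], label_counts,
                                           global_dist, var_coeff))
        if selected and (objective(selected + [best], label_counts,
                                   global_dist, var_coeff)
                         >= objective(selected, label_counts,
                                      global_dist, var_coeff)):
            break  # adding any further device no longer helps
        selected.append(best)
        remaining.remove(best)
    return selected
```

Raising `var_coeff` pushes the scheduler toward including more samples (lower sampling variance), while lowering it prioritizes matching the global label distribution, mirroring the flexible switch between the two objectives that the abstract reports.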