🤖 AI Summary
In federated learning (FL), increasing the number of participating clients diminishes the efficacy of each per-round update, slows convergence, and raises communication overhead. To address this, we propose Cohort-Parallel Federated Learning (CPFL), a paradigm that partitions clients into disjoint cohorts: each cohort independently trains its own model to convergence with standard FL, and the resulting models are then fused in a single round of knowledge distillation over a cross-domain, unlabeled dataset. The insight behind CPFL is that small, isolated networks converge faster than a single network in which all nodes participate, relaxing the conventional constraint of synchronously training one global model. Evaluated on CIFAR-10 under a non-IID setting with four cohorts, CPFL reduces total training time by 1.9× and cuts compute and communication costs by 1.3×, while incurring only a marginal drop in test accuracy. The result is a balanced trade-off among convergence speed, resource efficiency, and model performance.
📝 Abstract
Federated Learning (FL) is a machine learning approach in which nodes collaboratively train a global model. As more nodes participate in a round of FL, the effectiveness of each node's individual model update diminishes. In this study, we increase the effectiveness of client updates by dividing the network into smaller partitions, or cohorts. We introduce Cohort-Parallel Federated Learning (CPFL): a novel learning approach in which each cohort independently trains a global model using FL until convergence, and the models produced by the cohorts are then unified using one-shot Knowledge Distillation (KD) and a cross-domain, unlabeled dataset. The insight behind CPFL is that smaller, isolated networks converge more quickly than in a one-network setting where all nodes participate. Through exhaustive experiments involving realistic traces and non-IID data distributions on the CIFAR-10 and FEMNIST image classification tasks, we investigate the balance between the number of cohorts, model accuracy, training time, and compute and communication resources. Compared to traditional FL, CPFL with four cohorts, a non-IID data distribution, and CIFAR-10 yields a 1.9× reduction in train time and a 1.3× reduction in resource usage, with a minimal drop in test accuracy.
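The fusion step described above can be sketched in miniature: each cohort's converged model acts as a teacher, their predictions on an unlabeled transfer set are averaged into soft targets, and a single student is trained once against those targets. The sketch below is illustrative only, not the paper's implementation; the cohort models are stand-in random linear classifiers, and all dimensions and hyperparameters are assumptions.

```python
# Minimal one-shot knowledge-distillation fusion, in the spirit of CPFL:
# average the cohort teachers' soft predictions on unlabeled data, then
# fit one student model to those soft targets. Teachers here are random
# linear classifiers purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
D, C, N, K = 16, 10, 512, 4     # feature dim, classes, transfer samples, cohorts

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Each cohort independently produced a model (stand-in: random weights).
teachers = [rng.normal(size=(D, C)) for _ in range(K)]
X = rng.normal(size=(N, D))      # cross-domain, unlabeled transfer set

# One-shot fusion: soft targets are the average of teacher predictions.
soft = np.mean([softmax(X @ W) for W in teachers], axis=0)

def ce(p, q):
    # cross-entropy of predictions q against soft targets p
    return -np.mean(np.sum(p * np.log(q + 1e-12), axis=1))

# Distill into a single linear student via gradient descent on the
# cross-entropy between student outputs and the soft targets.
W_s = np.zeros((D, C))
loss_init = ce(soft, softmax(X @ W_s))
for _ in range(300):
    p = softmax(X @ W_s)
    W_s -= 0.5 * (X.T @ (p - soft)) / N   # gradient of cross-entropy

loss_final = ce(soft, softmax(X @ W_s))
# Fraction of transfer samples where the student's top class matches
# the averaged teachers' top class.
agreement = np.mean(softmax(X @ W_s).argmax(1) == soft.argmax(1))
```

Because distillation needs only teacher predictions, not client data, this fusion requires a single communication round per cohort, which is where CPFL's resource savings come from.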