🤖 AI Summary
To address high training latency, excessive communication overhead, and low participation efficiency of resource-constrained clients in Split Federated Learning (SFL) under edge-device computational heterogeneity, this paper proposes a novel three-stage model partitioning architecture. The architecture enables concurrent forward propagation, local training, and gradient aggregation across the server and both strong and weak clients, eliminating sequential dependencies inherent in conventional SFL. Key innovations include model partition adaptation to device capabilities, heterogeneous-aware coordination scheduling, parallelized gradient aggregation, and distributed model update mechanisms—collectively reducing communication rounds and computational idle time. Extensive experiments demonstrate that our method outperforms existing SFL and standard FL baselines in convergence speed, final model accuracy, and end-to-end training latency, particularly in mixed heterogeneous edge environments.
📝 Abstract
Federated learning (FL) operates based on model exchanges between the server and the clients, and it suffers from significant client-side computation and communication burden. Split federated learning (SFL) arises a promising solution by splitting the model into two parts, that are trained sequentially: the clients train the first part of the model (client-side model) and transmit it to the server that trains the second (server-side model). Existing SFL schemes though still exhibit long training delays and significant communication overhead, especially when clients of different computing capability participate. Thus, we propose Collaborative-Split Federated Learning~(C-SFL), a novel scheme that splits the model into three parts, namely the model parts trained at the computationally weak clients, the ones trained at the computationally strong clients, and the ones at the server. Unlike existing works, C-SFL enables parallel training and aggregation of model's parts at the clients and at the server, resulting in reduced training delays and commmunication overhead while improving the model's accuracy. Experiments verify the multiple gains of C-SFL against the existing schemes.