🤖 AI Summary
This paper addresses catastrophic forgetting in Split Federated Learning (SFL) caused by data heterogeneity and model partitioning: the sequential processing of client activations at the server biases the model toward later classes at the expense of earlier ones. To mitigate this, the authors propose Hydra, a lightweight parallel-head architecture inspired by multi-head mechanisms and tailored to SFL's Part-1/Part-2 model-splitting paradigm. Hydra jointly suppresses local model drift and server-side order sensitivity. Theoretical analysis and extensive experiments demonstrate that Hydra significantly alleviates label-level forgetting: it improves overall accuracy by 3.2–7.8% on CIFAR-10, CIFAR-100, and Tiny-ImageNet, and substantially improves class balance, boosting tail-class F1-score by up to 12.4%. Hydra consistently outperforms baselines, including FedAvg and SplitFed.
📝 Abstract
In Split Federated Learning (SFL), clients collaboratively train a model with the help of a server by splitting the model into two parts. Part-1 is trained locally at each client and aggregated at the end of each round. Part-2 is trained at a server that sequentially processes the intermediate activations received from each client. We study the phenomenon of catastrophic forgetting (CF) in SFL in the presence of data heterogeneity: due to the nature of SFL, local updates of Part-1 may drift away from the global optimum, while Part-2 is sensitive to the processing sequence, similar to forgetting in continual learning (CL). Specifically, we observe that the trained model performs better on classes (labels) seen toward the end of the sequence. We investigate this phenomenon with emphasis on key aspects of SFL, such as the processing order at the server and the choice of cut layer. Based on our findings, we propose Hydra, a novel mitigation method inspired by multi-head neural networks and adapted to the SFL setting. Extensive numerical evaluations show that Hydra outperforms baselines and methods from the literature.
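To make the SFL workflow described above concrete, the sketch below simulates one round with linear layers: each client holds its own Part-1, the server holds a single Part-2 and updates it sequentially on each client's cut-layer activations, and Part-1 copies are averaged at the end of the round. This is a minimal illustration under assumed simplifications (linear layers, MSE loss, plain SGD); the dimensions, client count, and function names are illustrative, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_CUT, D_OUT = 8, 4, 3   # input dim, cut-layer width, output dim
NUM_CLIENTS = 3

# Part-1: one copy per client (trained locally, aggregated each round).
part1 = [rng.normal(size=(D_IN, D_CUT)) for _ in range(NUM_CLIENTS)]
# Part-2: a single copy kept at the server.
part2 = rng.normal(size=(D_CUT, D_OUT))

def server_round(part1, part2, client_data, lr=0.01):
    """Server processes clients' activations one after another; the
    order sensitivity studied in the paper arises from this loop."""
    for W1, (x, y) in zip(part1, client_data):
        act = x @ W1                      # client sends cut-layer activations
        logits = act @ part2
        grad = 2 * act.T @ (logits - y)   # MSE gradient w.r.t. Part-2
        part2 -= lr * grad                # sequential server-side update
    return part2

# Synthetic per-client batches (heterogeneity: each client's data differs).
client_data = [(rng.normal(size=(5, D_IN)), rng.normal(size=(5, D_OUT)))
               for _ in range(NUM_CLIENTS)]
part2 = server_round(part1, part2, client_data)

# End-of-round Part-1 aggregation (FedAvg-style mean).
part1_global = np.mean(part1, axis=0)
print(part1_global.shape, part2.shape)
```

Because Part-2's update for the last client overwrites part of what it learned from the first, classes seen late in the loop are favored, which is the forgetting effect Hydra's parallel heads are designed to counteract.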