🤖 AI Summary
Federated learning (FL) introduces non-trivial accuracy degradation compared to centralized training, yet systematic, cross-modal quantification of this impact remains lacking. Method: This work conducts a comprehensive empirical evaluation of FL’s effect on model accuracy across four modalities—text, image, audio, and video—using representative architectures (e.g., BERT, ResNet, Wav2Vec) under a unified FL framework. We vary critical factors including data distribution (IID, non-IID, long-tailed), dataset scale, client sampling strategies, and local/global computation configurations. Contribution/Results: To our knowledge, this is the first study to quantify accuracy gains and losses across modalities, tasks, and FL configurations. Results show average accuracy drops of 1.2–8.7% under non-IID settings and up to 23.1% under long-tailed distributions, while balanced small-scale data incurs as little as 0.3% loss. We identify high-risk scenarios for severe degradation and robust operational regimes, enabling principled, quantitative accuracy risk assessment for real-world FL deployment.
📝 Abstract
Federated Learning (FL) enables distributed training of ML models on private user data at global scale. Although FL has shown promise in many domains, its impact on model accuracy is not yet well understood. In this paper, we systematically investigate how this learning paradigm affects the accuracy of state-of-the-art ML models across a variety of ML tasks. We present an empirical study covering several data types (text, image, audio, and video) and FL configuration knobs (data distribution, FL scale, client sampling, and local and global computation). To achieve high fidelity, all experiments are conducted in a unified FL framework, which required substantial human effort and resource investment. Based on the results, we perform a quantitative analysis of the impact of FL, highlighting challenging scenarios where applying FL drastically degrades model accuracy and identifying cases where the impact is negligible. These detailed and extensive findings can benefit practical deployments and the future development of FL.
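To make the configuration knobs concrete, the following is a minimal, illustrative sketch and not the framework used in the paper. It assumes FedAvg-style weighted averaging and a Dirichlet label-skew partition, neither of which is stated in the abstract, and it uses a toy scalar model that estimates the mean label. All names (`FLConfig`, `dirichlet_partition`, `local_update`, `fedavg`) are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class FLConfig:
    """Hypothetical knob set mirroring the factors named in the abstract."""
    num_clients: int = 10         # FL scale
    sample_fraction: float = 0.3  # client sampling per round
    local_steps: int = 5          # local computation
    rounds: int = 20              # global computation
    alpha: float = 0.5            # Dirichlet concentration (assumed); smaller = more skew
    lr: float = 0.1               # local learning rate


def dirichlet_partition(labels, num_clients, alpha, rng):
    """Split sample indices across clients with label skew (non-IID)."""
    clients = [[] for _ in range(num_clients)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Dirichlet draw via normalized Gamma samples.
        g = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(g) or 1.0
        start, acc = 0, 0.0
        for c in range(num_clients):
            acc += g[c] / total
            # The last client takes whatever remains so every index is assigned.
            end = len(idx) if c == num_clients - 1 else min(len(idx), round(acc * len(idx)))
            clients[c].extend(idx[start:end])
            start = max(start, end)
    return clients


def local_update(w, values, steps, lr):
    """Run a few full-batch gradient steps on 0.5 * mean((w - v)^2)."""
    for _ in range(steps):
        grad = sum(w - v for v in values) / len(values)
        w -= lr * grad
    return w


def fedavg(values, parts, cfg, rng):
    """FedAvg-style loop (assumed): sample clients, train locally, average by size."""
    w = 0.0
    nonempty = [p for p in parts if p]
    k = max(1, int(cfg.sample_fraction * len(nonempty)))
    for _ in range(cfg.rounds):
        chosen = rng.sample(nonempty, k)
        updates = [
            (local_update(w, [values[i] for i in p], cfg.local_steps, cfg.lr), len(p))
            for p in chosen
        ]
        n = sum(m for _, m in updates)
        w = sum(wi * m for wi, m in updates) / n
    return w


rng = random.Random(0)
labels = [i % 4 for i in range(400)]  # four balanced classes
cfg = FLConfig()
parts = dirichlet_partition(labels, cfg.num_clients, cfg.alpha, rng)
w = fedavg([float(y) for y in labels], parts, cfg, rng)
print(f"federated estimate: {w:.3f}  (centralized mean: 1.500)")
```

Lowering `alpha` or `sample_fraction` in this toy setup tends to push the federated estimate away from the centralized mean, which is a rough analogue of the accuracy gap the study measures on real models.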