An Empirical Study of the Impact of Federated Learning on Machine Learning Model Accuracy

📅 2025-03-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Federated learning (FL) introduces non-trivial accuracy degradation compared to centralized training, yet systematic, cross-modal quantification of this impact remains lacking. Method: This work conducts a comprehensive empirical evaluation of FL’s effect on model accuracy across four modalities—text, image, audio, and video—using representative architectures (e.g., BERT, ResNet, Wav2Vec) under a unified FL framework. We vary critical factors including data distribution (IID, non-IID, long-tailed), dataset scale, client sampling strategies, and local/global computation configurations. Contribution/Results: To our knowledge, this is the first study to quantify accuracy gains and losses across modalities, tasks, and FL configurations. Results show average accuracy drops of 1.2–8.7% under non-IID settings and up to 23.1% under long-tailed distributions, while balanced small-scale data incurs as little as 0.3% loss. We identify high-risk scenarios for severe degradation and robust operational regimes, enabling principled, quantitative accuracy risk assessment for real-world FL deployment.
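To make the data-distribution knob concrete, here is a minimal sketch of how a labeled dataset might be split across clients in an IID and a label-skewed (non-IID) way. The summary does not say which partitioning scheme the study uses; the shard-based label skew below is one common choice and is an assumption, as are the function names `partition_iid`, `partition_label_skew`, and `label_histogram` and the toy 10-class dataset.

```python
import random
from collections import Counter

def partition_iid(labels, num_clients, seed=0):
    """Shuffle sample indices and deal them round-robin to clients."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    return [idx[c::num_clients] for c in range(num_clients)]

def partition_label_skew(labels, num_clients, shards_per_client=2, seed=0):
    """Sort indices by label, cut them into contiguous shards, and give each
    client a few shards, so each client sees only a few classes.
    Samples left over after the integer division are dropped."""
    rng = random.Random(seed)
    idx = sorted(range(len(labels)), key=lambda i: labels[i])
    num_shards = num_clients * shards_per_client
    shard_size = len(idx) // num_shards
    shards = [idx[s * shard_size:(s + 1) * shard_size] for s in range(num_shards)]
    rng.shuffle(shards)
    return [
        [i for shard in shards[c * shards_per_client:(c + 1) * shards_per_client]
         for i in shard]
        for c in range(num_clients)
    ]

def label_histogram(labels, client_indices):
    """Count how many samples of each class one client holds."""
    return Counter(labels[i] for i in client_indices)

# Hypothetical toy dataset: 1000 samples, 10 balanced classes.
labels = [i % 10 for i in range(1000)]
iid = partition_iid(labels, num_clients=10)
skew = partition_label_skew(labels, num_clients=10)

for name, parts in (("IID", iid), ("label-skew", skew)):
    classes_per_client = [len(label_histogram(labels, p)) for p in parts]
    print(name, "classes per client:", classes_per_client)
```

Comparing the per-client class counts of the two partitions shows the kind of heterogeneity that the non-IID settings in the summary refer to. A long-tailed split would additionally make the class sizes themselves unequal.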

📝 Abstract

Federated Learning (FL) enables distributed ML model training on private user data at global scale. Despite the potential FL has demonstrated in many domains, its impact on model accuracy remains poorly understood. In this paper, we systematically investigate how this learning paradigm affects the accuracy of state-of-the-art ML models across a variety of ML tasks. We present an empirical study that covers various data types (text, image, audio, and video) and FL configuration knobs (data distribution, FL scale, client sampling, and local and global computations). Our experiments are conducted in a unified FL framework to achieve high fidelity, with substantial human effort and resource investment. Based on the results, we perform a quantitative analysis of the impact of FL, highlighting challenging scenarios where applying FL drastically degrades model accuracy and identifying cases where the impact is negligible. The detailed and extensive findings can benefit practical deployments and future development of FL.
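The client sampling and local/global computation knobs can be illustrated with a small federated averaging loop. The abstract does not name the aggregation algorithm; this sketch assumes FedAvg-style weighted averaging, and the scalar model, the synthetic data, and the names `local_update`, `fedavg_round`, `sample_fraction`, and `local_epochs` are all hypothetical.

```python
import random

def local_update(w, data, local_epochs, lr):
    """Local computation: a few epochs of SGD on one client's (x, y) pairs,
    fitting the scalar model y = w * x under squared error."""
    for _ in range(local_epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x
            w -= lr * grad
    return w

def fedavg_round(w_global, clients, sample_fraction, local_epochs, lr, rng):
    """One global round: sample a subset of clients, train locally from the
    current global weight, then average results weighted by client data size."""
    k = max(1, round(sample_fraction * len(clients)))
    chosen = rng.sample(clients, k)
    total = sum(len(d) for d in chosen)
    return sum(
        local_update(w_global, d, local_epochs, lr) * len(d) for d in chosen
    ) / total

rng = random.Random(0)
true_w = 3.0
# Ten synthetic clients, each holding 20 noise-free samples of y = 3x.
clients = []
for _ in range(10):
    xs = [rng.uniform(0.5, 1.0) for _ in range(20)]
    clients.append([(x, true_w * x) for x in xs])

w = 0.0
for _ in range(20):  # number of global rounds
    w = fedavg_round(w, clients, sample_fraction=0.3,
                     local_epochs=2, lr=0.1, rng=rng)
print("learned weight:", w)
```

In this toy setup every client shares the same underlying relationship, so sampling only 30% of clients per round still recovers the true weight. The degradation the study measures arises when client data differ, which this sketch deliberately leaves out.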

Problem

Research questions and friction points this paper is trying to address.

Investigates FL's impact on ML model accuracy

Examines FL effects across diverse data types

Identifies scenarios where FL degrades model performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Learning for distributed ML training

Empirical study of FL's impact on model accuracy

Unified FL framework for high fidelity

🔎 Similar Papers

From Challenges and Pitfalls to Recommendations and Opportunities: Implementing Federated Learning in Healthcare