🤖 AI Summary
Federated learning (FL) faces challenges in medical applications due to data heterogeneity, limited edge-device resources, and the need for robust, reproducible baselines. Method: This work conducts a systematic empirical evaluation of the FedAvg algorithm on real-world medical imaging tasks, specifically blood cell and skin lesion classification, under diverse conditions: multiple model architectures (including Vision Transformers), non-IID data distributions, and varied hyperparameter configurations. Contribution/Results: The study provides the first comprehensive validation of FedAvg’s outlier robustness in authentic clinical imaging settings. FedAvg consistently achieves high performance across all configurations without extensive hyperparameter tuning, significantly outperforming several state-of-the-art FL methods. Its lightweight deployment footprint and strong generalization make it especially suitable for resource-constrained clinical edge environments. The study establishes FedAvg as a trustworthy, simple, reproducible, and clinically viable baseline for medical FL, bridging the gap between algorithmic research and practical healthcare deployment.
📝 Abstract
Federated Learning (FL) is a distributed machine learning paradigm that enables collaborative model training across decentralized clients while preserving data privacy. In this paper, we revisit the stability of the vanilla FedAvg algorithm under diverse conditions. Despite its conceptual simplicity, FedAvg exhibits remarkably stable performance compared to more advanced FL techniques. Our experiments assess the performance of various FL methods on blood cell and skin lesion classification tasks using a Vision Transformer (ViT). Additionally, we evaluate the impact of different representative classification models and analyze sensitivity to hyperparameter variations. The results consistently demonstrate that FedAvg maintains robust performance regardless of the dataset, the classification model employed, or the hyperparameter settings. Given its stability and its robust performance without the need for extensive hyperparameter tuning, FedAvg is a safe and efficient choice for FL deployments in resource-constrained hospitals handling medical data. These findings underscore the enduring value of the vanilla FedAvg approach as a trusted baseline for clinical practice.
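For readers unfamiliar with the aggregation step that "vanilla FedAvg" refers to, the following is a minimal sketch, not the paper's implementation. It assumes each client returns its locally trained parameters as plain lists of floats, and that the server weights each client by its number of local training samples. The function name `fedavg_aggregate`, the parameter layout, and the two example "hospitals" are hypothetical and chosen only for illustration. Local training and communication are omitted.

```python
from typing import Dict, List, Sequence

# Hypothetical parameter layout: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def fedavg_aggregate(
    client_params: Sequence[Params],
    client_sizes: Sequence[int],
) -> Params:
    """Return the sample-size-weighted average of client parameters."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    aggregated: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        # Weight each client's value by its share of the total samples.
        aggregated[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return aggregated


# Illustrative data only: two clients with 100 and 300 local samples.
hospital_a = {"w": [1.0, 2.0], "b": [0.0]}
hospital_b = {"w": [3.0, 6.0], "b": [4.0]}
global_params = fedavg_aggregate([hospital_a, hospital_b], [100, 300])
print(global_params)
```

In this sketch the client with more samples pulls the global parameters toward its own values. The server needs no tuning knobs beyond the usual local training settings, which is in keeping with the abstract's point about FedAvg working without extensive hyperparameter tuning.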