🤖 AI Summary
Resource-constrained edge devices in federated learning (FL) face high computational overhead, low data utilization efficiency, and inadequate privacy protection. Method: We propose FedFT-EDS, a novel FL framework that combines fine-tuning of partial client models with information-entropy-driven dynamic data subsampling. FedFT-EDS quantifies sample uncertainty via entropy and adaptively selects high-informativeness subsets (using only 50% of local data per client), while remaining compatible with standard aggregation algorithms (e.g., FedAvg, FedProx). Contribution/Results: On CIFAR-10 and CIFAR-100, FedFT-EDS reduces per-client training time to one third of the baselines, a 3× efficiency gain, while simultaneously improving global model accuracy. By explicitly relaxing the implicit "full-data participation" assumption in FL, FedFT-EDS significantly improves feasibility for lightweight devices and strengthens privacy preservation through reduced data exposure and computation.
📝 Abstract
With the rapid expansion of edge devices, such as IoT devices, where crucial data needed for machine learning applications is generated, it becomes essential to promote their participation in privacy-preserving Federated Learning (FL) systems. The best way to achieve this desideratum is by reducing their training workload to match their constrained computational resources. While prior FL research has addressed the workload constraints by introducing lightweight models on the edge, limited attention has been given to optimizing on-device training efficiency by reducing the amount of data needed during training. In this work, we propose FedFT-EDS, a novel approach that combines Fine-Tuning of partial client models with Entropy-based Data Selection to reduce training workloads on edge devices. By actively selecting the most informative local instances for learning, FedFT-EDS significantly reduces the training data used in FL and demonstrates that not all user data is equally beneficial for FL in all rounds. Our experiments on CIFAR-10 and CIFAR-100 show that FedFT-EDS uses only 50% of user data while improving the global model performance compared to the baseline methods, FedAvg and FedProx. Importantly, FedFT-EDS improves client learning efficiency by up to 3 times, using one third of the training time on clients to achieve performance equivalent to the baselines. This work highlights the importance of data selection in FL and presents a promising pathway to scalable and efficient Federated Learning.
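The entropy-based selection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each client scores its local samples with the current model's softmax outputs, ranks them by predictive entropy, and keeps the most uncertain half for that round. The function name and the 50% fraction default are taken from the reported setup; everything else is an assumption.

```python
import numpy as np

def entropy_select(probs: np.ndarray, fraction: float = 0.5) -> np.ndarray:
    """Return indices of the highest-entropy (most informative) samples.

    probs:    (n_samples, n_classes) softmax outputs of the local model.
    fraction: share of local data to keep (the paper reports 50%).
    """
    eps = 1e-12  # guard against log(0) for confident predictions
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    k = max(1, int(round(fraction * len(probs))))
    # argsort is ascending, so the last k indices are the most uncertain
    return np.argsort(entropy)[-k:]

# Toy example: 4 samples over 3 classes (illustrative values only)
probs = np.array([
    [0.98, 0.01, 0.01],  # confident  -> low entropy, likely skipped
    [0.34, 0.33, 0.33],  # uncertain  -> high entropy, likely kept
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
])
selected = entropy_select(probs, fraction=0.5)  # picks the two uncertain samples
```

In a full FL round, each client would run this selection before local fine-tuning and train only on the selected subset, which is where the reported reduction in on-device training time comes from.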