AI Summary
To address the degraded performance of sparse training in cross-device federated learning (FL) caused by data heterogeneity, non-adaptive masking strategies, and complex hyperparameter tuning, this paper proposes SparsyFed, a practical federated sparse training framework. SparsyFed jointly achieves client consensus on sparse models, data-aware mask adaptation, and optimization controlled by a single hyperparameter. Without additional tuning, it maintains near-lossless accuracy at 95% sparsity. Its per-round weight regrowth is 200 times smaller than that of prior methods (roughly 0.5% of the baseline). Under severe data heterogeneity, SparsyFed outperforms all existing sparse FL approaches while preserving high communication efficiency, low computational overhead, and robustness across diverse client distributions.
Abstract
Sparse training is often adopted in cross-device federated learning (FL) environments, where constrained devices collaboratively train a machine learning model on private data by exchanging pseudo-gradients across heterogeneous networks. Although sparse training methods can reduce communication overhead and computational burden in FL, they are often not used in practice for the following key reasons: (1) data heterogeneity makes it harder for clients to reach consensus on sparse models compared to dense ones, requiring longer training; (2) methods for obtaining sparse masks lack adaptivity to accommodate very heterogeneous data distributions, which is crucial in cross-device FL; and (3) additional hyperparameters are required, which are notably challenging to tune in FL. This paper presents SparsyFed, a practical federated sparse training method that critically addresses the problems above. Previous works have solved only one or two of these challenges, at the expense of introducing new trade-offs, such as clients' consensus on masks versus sparsity pattern adaptivity. We show that SparsyFed simultaneously (1) produces 95% sparse models with negligible degradation in accuracy while needing only a single hyperparameter, (2) achieves a per-round weight regrowth 200 times smaller than that of previous methods, and (3) allows the sparse masks to adapt to highly heterogeneous data distributions, outperforming all baselines under such conditions.