Trading-off Accuracy and Communication Cost in Federated Learning

📅 2025-03-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the fundamental trade-off between model accuracy and communication cost in federated learning, this paper proposes a novel framework integrating weight quantization with low-dimensional parameter encoding. Specifically, it employs a fixed sparse random matrix $Q$ to linearly encode the high-dimensional weights $w \in \mathbb{R}^d$ via a compact, trainable low-dimensional vector $p \in \mathbb{R}^k$ ($k \ll d$), such that $w = Qp$, enabling efficient reconstruction and storage. The work establishes, for the first time, a theoretical connection between “training-by-sampling” and stochastic convex geometry, yielding rigorous convergence guarantees and geometric insights. Empirically, the method achieves lossless accuracy while reducing communication overhead by 34×; its compression efficiency surpasses state-of-the-art approaches by an order of magnitude, thereby substantially alleviating the accuracy–bandwidth trade-off bottleneck inherent in existing federated learning methods.
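The core encoding described above can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the dimensions `d`, `k`, and the `density` of $Q$ are assumed values chosen only to show the mechanics of $w = Qp$.

```python
import numpy as np

# Hypothetical sketch of the paper's weight encoding w = Q p.
# d, k, and density are illustrative assumptions, not values from the paper.
rng = np.random.default_rng(0)

d, k = 10_000, 300          # full weight dimension vs. compact parameter dimension (k << d)
density = 0.01              # fraction of nonzero entries in the fixed sparse matrix Q

# Fixed sparse random matrix Q in R^{d x k}: generated once, never trained.
mask = rng.random((d, k)) < density
Q = np.where(mask, rng.standard_normal((d, k)), 0.0)

# Only the low-dimensional vector p is trainable (and communicated).
p = rng.standard_normal(k)

# Reconstruct the full weight vector on demand.
w = Q @ p
assert w.shape == (d,)

# Communicating k floats instead of d gives roughly a d/k compression factor.
print(f"compression ratio: {d / k:.1f}x")
```

Since $Q$ is fixed, both parties can regenerate it from a shared seed; only $p$ ever needs to cross the network.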

📝 Abstract
Leveraging the training-by-pruning paradigm introduced by Zhou et al., Isik et al. introduced a federated learning protocol that achieves a 34-fold reduction in communication cost. We achieve compression improvements of orders of magnitude over the state-of-the-art. The central idea of our framework is to encode the network weights $\vec w$ by the vector of trainable parameters $\vec p$, such that $\vec w = Q \cdot \vec p$, where $Q$ is a carefully generated sparse random matrix that remains fixed throughout training. In this framework, the previous work of Zhou et al. [NeurIPS'19] is recovered when $Q$ is diagonal and $\vec p$ has the same dimension as $\vec w$. We instead show that $\vec p$ can effectively be chosen much smaller than $\vec w$, while retaining the same accuracy at the price of a decrease in the sparsity of $Q$. Since server and clients only need to share $\vec p$, this trade-off leads to a substantial improvement in communication cost. Moreover, we provide theoretical insight into our framework and establish a novel link between training-by-sampling and random convex geometry.
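To make concrete why sharing only $\vec p$ cuts communication, here is a minimal FedAvg-style sketch under the abstract's framework. Everything beyond "share a fixed $Q$ via a common seed and exchange only $p$" is an assumption: the dimensions, the number of clients, and the toy `local_update` stand-in for real local training are all hypothetical.

```python
import numpy as np

# Server and clients regenerate the same fixed sparse Q from a shared seed,
# so each round only the k-dimensional vector p is exchanged.
def make_Q(d, k, density, seed):
    rng = np.random.default_rng(seed)   # shared seed => identical Q everywhere
    mask = rng.random((d, k)) < density
    return np.where(mask, rng.standard_normal((d, k)), 0.0)

d, k = 5_000, 200                       # illustrative dimensions (k << d)
Q = make_Q(d, k, density=0.02, seed=42)

def local_update(p, client_id):
    # Stand-in for real local training on client data: a small perturbation.
    rng = np.random.default_rng(client_id)
    return p + 0.01 * rng.standard_normal(p.shape)

p_global = np.zeros(k)
for rnd in range(3):
    # Each client downloads p (k floats), trains, and uploads its new p.
    updates = [local_update(p_global, cid) for cid in range(5)]
    p_global = np.mean(updates, axis=0)  # server aggregates in p-space

w_global = Q @ p_global                  # full model recovered locally when needed
print(f"per-round traffic: {k} floats instead of {d} ({d // k}x less)")
```

Aggregating in $p$-space is consistent with the linear encoding: averaging the clients' $\vec p$ vectors and then applying $Q$ yields the same model as averaging the reconstructed $\vec w$ vectors, since $Q$ is linear and identical everywhere.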
Problem

Research questions and friction points this paper is trying to address.

Reduces communication cost in federated learning
Improves compression over state-of-the-art methods
Links training-by-sampling to random convex geometry
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reduces communication cost 34-fold in federated learning
Uses sparse random matrix for weight encoding
Achieves high compression with minimal accuracy loss