LoCoDL: Communication-Efficient Distributed Learning with Local Training and Compression

📅 2024-03-07
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
To address the high communication overhead, latency, and data heterogeneity of federated learning, this paper proposes LoCoDL, an algorithm that combines local iterative updates with unbiased compression (supporting both sparsification and quantization) to reduce the communication frequency and the size of each message. Theoretically, LoCoDL enjoys a doubly accelerated communication complexity in the general heterogeneous regime with strongly convex functions: its bound is accelerated with respect to both the condition number of the functions and the model dimension, and the analysis covers a large class of unbiased compressors. Empirically, LoCoDL outperforms state-of-the-art distributed and federated learning methods in total communicated bits, which makes it particularly attractive for bandwidth-constrained deployments.
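
To make the unbiased compression mentioned above concrete, here is a minimal sketch of rand-k sparsification in Python (the function name and the test harness are our illustration, not code from the paper): keeping k random coordinates of a d-dimensional vector, rescaled by d/k, gives a compressor C with E[C(x)] = x.

```python
import numpy as np

def rand_k(x: np.ndarray, k: int, rng: np.random.Generator) -> np.ndarray:
    """Unbiased rand-k sparsification: keep k random coordinates of x,
    rescaled by d/k so that E[C(x)] = x."""
    d = x.size
    out = np.zeros_like(x)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = x[idx] * (d / k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal(10)
# Averaging many compressed copies recovers x, illustrating unbiasedness.
est = np.mean([rand_k(x, 3, rng) for _ in range(100_000)], axis=0)
print(np.allclose(est, x, atol=0.05))
```

Only the k selected coordinates (indices plus values) need to be transmitted, which is why the compressed message is a short bitstream rather than a full vector of floats.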

Technology Category

Application Category

📝 Abstract
In Distributed optimization and Learning, and even more in the modern framework of federated learning, communication, which is slow and costly, is critical. We introduce LoCoDL, a communication-efficient algorithm that leverages the two popular and effective techniques of Local training, which reduces the communication frequency, and Compression, in which short bitstreams are sent instead of full-dimensional vectors of floats. LoCoDL works with a large class of unbiased compressors that includes widely-used sparsification and quantization methods. LoCoDL provably benefits from local training and compression and enjoys a doubly-accelerated communication complexity, with respect to the condition number of the functions and the model dimension, in the general heterogeneous regime with strongly convex functions. This is confirmed in practice, with LoCoDL outperforming existing algorithms.
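
To show how the two techniques interact, below is a generic sketch of local training combined with compressed communication on two heterogeneous quadratic clients. This is our own illustration of the pattern, not LoCoDL's actual recursion, which differs in important ways (e.g., in what exactly is compressed and when communication happens): each client runs a few local gradient steps, then sends only a compressed model difference, and the server averages the short messages.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_k(x, k=2):
    """Unbiased rand-k compressor (same construction as the sketch above)."""
    out = np.zeros_like(x)
    idx = rng.choice(x.size, size=k, replace=False)
    out[idx] = x[idx] * (x.size / k)
    return out

# Two heterogeneous quadratic clients: f_i(w) = 0.5 * ||w - b_i||^2.
targets = [np.array([1.0, -2.0, 0.5, 3.0]), np.array([-1.0, 0.0, 2.0, 1.0])]
grads = [lambda w, b=b: w - b for b in targets]

w = np.zeros(4)
for _ in range(300):                         # communication rounds
    updates = []
    for grad in grads:
        w_local = w.copy()
        for _ in range(5):                   # local steps between communications
            w_local -= 0.1 * grad(w_local)
        updates.append(rand_k(w_local - w))  # send a short message, not the model
    w += np.mean(updates, axis=0)            # server averages compressed updates

print(w)                                     # close to np.mean(targets, axis=0)
```

Plain compression of this naive kind leaves a variance floor under data heterogeneity; avoiding that floor with provable guarantees is precisely the kind of difficulty LoCoDL's design and analysis address.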
Problem

Research questions and friction points this paper is trying to address.

Communication between clients and the server is slow and costly, and dominates distributed and federated learning
Sending full-dimensional vectors of floats at every iteration wastes bandwidth
Heterogeneous data makes it hard to combine local training and compression with provable guarantees
Innovation

Methods, ideas, or system contributions that make the work stand out.

Local training reduces the communication frequency
Compression sends short bitstreams instead of full-dimensional vectors of floats
Supports a large class of unbiased compressors, including sparsification and quantization (see the sketch after this list)
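
As a complement to the sparsification example above, here is a sketch of an unbiased stochastic quantizer in the QSGD style; the paper supports this family of compressors, but this exact code is our illustration, not the paper's. Each coordinate is randomly rounded to one of the adjacent quantization levels with probabilities chosen so that E[C(x)] = x.

```python
import numpy as np

def stochastic_quantize(x: np.ndarray, levels: int,
                        rng: np.random.Generator) -> np.ndarray:
    """Unbiased stochastic quantization: round each coordinate to one of
    `levels` uniform levels in [0, ||x||], up or down at random with
    probabilities that preserve E[C(x)] = x."""
    norm = np.linalg.norm(x)
    if norm == 0.0:
        return x.copy()
    scaled = np.abs(x) / norm * levels        # each entry lies in [0, levels]
    lower = np.floor(scaled)
    prob = scaled - lower                     # probability of rounding up
    q = lower + (rng.random(x.shape) < prob)  # randomized rounding
    return np.sign(x) * q * norm / levels

rng = np.random.default_rng(1)
x = rng.standard_normal(6)
# Averaging many quantized copies recovers x, confirming unbiasedness.
est = np.mean([stochastic_quantize(x, 4, rng) for _ in range(200_000)], axis=0)
print(np.allclose(est, x, atol=0.01))
```

Each quantized coordinate can be encoded with a few bits (a sign and a level index) plus one float for the norm, which is the source of the communication savings.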
Laurent Condat
Senior Research Scientist, King Abdullah University of Science and Technology (KAUST), Saudi Arabia
optimization, convex optimization, nonsmooth optimization, federated learning, signal and image processing
Artavazd Maranjyan
Computer Science Program, CEMSE Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
Peter Richtárik
Computer Science Program, CEMSE Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia; SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence (SDAIA-KAUST AI)