LoCoDL: Communication-Efficient Distributed Learning with Local Training and Compression

📅 2024-03-07
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
To address the high communication overhead, latency, and data heterogeneity of federated learning, this paper proposes LoCoDL, an algorithm that combines local iterative updates with unbiased compression (supporting both sparsification and quantization) to reduce the communication frequency and the size of each message. Theoretically, LoCoDL enjoys a doubly accelerated communication complexity in the general heterogeneous regime with strongly convex functions: its bound is accelerated with respect to both the condition number of the functions and the model dimension, and the analysis covers a large class of unbiased compressors. Empirically, LoCoDL outperforms state-of-the-art distributed and federated learning methods in total communicated bits, which makes it particularly attractive for bandwidth-constrained deployments.
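
To make the unbiased compression mentioned above concrete, here is a minimal sketch of rand-k sparsification in Python (the function name and the test harness are our illustration, not code from the paper): keeping k random coordinates of a d-dimensional vector, rescaled by d/k, gives a compressor C with E[C(x)] = x.

```python
import numpy as np

def rand_k(x: np.ndarray, k: int, rng: np.random.Generator) -> np.ndarray:
    """Unbiased rand-k sparsification: keep k random coordinates of x,
    rescaled by d/k so that E[C(x)] = x."""
    d = x.size
    out = np.zeros_like(x)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = x[idx] * (d / k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal(10)
# Averaging many compressed copies recovers x, illustrating unbiasedness.
est = np.mean([rand_k(x, 3, rng) for _ in range(100_000)], axis=0)
print(np.allclose(est, x, atol=0.05))
```

Only the k selected coordinates (indices plus values) need to be transmitted, which is why the compressed message is a short bitstream rather than a full vector of floats.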

Technology Category

Application Category

📝 Abstract
In Distributed optimization and Learning, and even more in the modern framework of federated learning, communication, which is slow and costly, is critical. We introduce LoCoDL, a communication-efficient algorithm that leverages the two popular and effective techniques of Local training, which reduces the communication frequency, and Compression, in which short bitstreams are sent instead of full-dimensional vectors of floats. LoCoDL works with a large class of unbiased compressors that includes widely-used sparsification and quantization methods. LoCoDL provably benefits from local training and compression and enjoys a doubly-accelerated communication complexity, with respect to the condition number of the functions and the model dimension, in the general heterogeneous regime with strongly convex functions. This is confirmed in practice, with LoCoDL outperforming existing algorithms.
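
To show how the two techniques interact, below is a generic sketch of local training combined with compressed communication on two heterogeneous quadratic clients. This is our own illustration of the pattern, not LoCoDL's actual recursion, which differs in important ways (e.g., in what exactly is compressed and when communication happens): each client runs a few local gradient steps, then sends only a compressed model difference, and the server averages the short messages.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_k(x, k=2):
    """Unbiased rand-k compressor (same construction as the sketch above)."""
    out = np.zeros_like(x)
    idx = rng.choice(x.size, size=k, replace=False)
    out[idx] = x[idx] * (x.size / k)
    return out

# Two heterogeneous quadratic clients: f_i(w) = 0.5 * ||w - b_i||^2.
targets = [np.array([1.0, -2.0, 0.5, 3.0]), np.array([-1.0, 0.0, 2.0, 1.0])]
grads = [lambda w, b=b: w - b for b in targets]

w = np.zeros(4)
for _ in range(300):                         # communication rounds
    updates = []
    for grad in grads:
        w_local = w.copy()
        for _ in range(5):                   # local steps between communications
            w_local -= 0.1 * grad(w_local)
        updates.append(rand_k(w_local - w))  # send a short message, not the model
    w += np.mean(updates, axis=0)            # server averages compressed updates

print(w)                                     # close to np.mean(targets, axis=0)
```

Plain compression of this naive kind leaves a variance floor under data heterogeneity; avoiding that floor with provable guarantees is precisely the kind of difficulty LoCoDL's design and analysis address.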
Problem

Research questions and friction points this paper is trying to address.

Communication between clients and the server is slow and costly, and dominates distributed and federated learning
Sending full-dimensional vectors of floats at every iteration wastes bandwidth
Heterogeneous data makes it hard to combine local training and compression with provable guarantees
Innovation

Methods, ideas, or system contributions that make the work stand out.

Local training reduces the communication frequency
Compression sends short bitstreams instead of full-dimensional vectors of floats
Supports a large class of unbiased compressors, including sparsification and quantization (see the sketch after this list)
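
As a complement to the sparsification example above, here is a sketch of an unbiased stochastic quantizer in the QSGD style; the paper supports this family of compressors, but this exact code is our illustration, not the paper's. Each coordinate is randomly rounded to one of the adjacent quantization levels with probabilities chosen so that E[C(x)] = x.

```python
import numpy as np

def stochastic_quantize(x: np.ndarray, levels: int,
                        rng: np.random.Generator) -> np.ndarray:
    """Unbiased stochastic quantization: round each coordinate to one of
    `levels` uniform levels in [0, ||x||], up or down at random with
    probabilities that preserve E[C(x)] = x."""
    norm = np.linalg.norm(x)
    if norm == 0.0:
        return x.copy()
    scaled = np.abs(x) / norm * levels        # each entry lies in [0, levels]
    lower = np.floor(scaled)
    prob = scaled - lower                     # probability of rounding up
    q = lower + (rng.random(x.shape) < prob)  # randomized rounding
    return np.sign(x) * q * norm / levels

rng = np.random.default_rng(1)
x = rng.standard_normal(6)
# Averaging many quantized copies recovers x, confirming unbiasedness.
est = np.mean([stochastic_quantize(x, 4, rng) for _ in range(200_000)], axis=0)
print(np.allclose(est, x, atol=0.01))
```

Each quantized coordinate can be encoded with a few bits (a sign and a level index) plus one float for the norm, which is the source of the communication savings.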
Laurent Condat
Senior Research Scientist, King Abdullah University of Science and Technology (KAUST), Saudi Arabia
optimization, convex optimization, nonsmooth optimization, federated learning, signal and image processing
Artavazd Maranjyan
Computer Science Program, CEMSE Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
Peter Richtárik
Computer Science Program, CEMSE Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia; SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence (SDAIA-KAUST AI)