🤖 AI Summary
This work proposes COCO-EF, a novel approach that addresses the communication bottleneck and straggler problem in distributed learning by systematically integrating biased gradient compression with error feedback inside a gradient coding framework. In this method, non-straggling workers encode their local gradients, add accumulated compression errors from previous rounds, apply a biased compression function, and transmit the compressed results; the server then aggregates these compressed updates to approximate the global gradient for the model update. Theoretical analysis establishes convergence of the algorithm under standard assumptions, and extensive experiments show that COCO-EF substantially improves communication efficiency while maintaining high learning accuracy, outperforming existing baseline methods.
📝 Abstract
Communication bottlenecks and the presence of stragglers pose significant challenges in distributed learning (DL). To deal with these challenges, recent advances leverage unbiased compression functions and gradient coding. However, the significant benefits of biased compression remain largely unexplored. To close this gap, we propose Compressed Gradient Coding with Error Feedback (COCO-EF), a novel DL method that combines gradient coding with biased compression to mitigate straggler effects and reduce communication costs. In each iteration, non-straggler devices encode local gradients from redundantly allocated training data, incorporate prior compression errors, and compress the results using biased compression functions before transmission. The server aggregates these compressed messages from the non-stragglers to approximate the global gradient for model updates. We provide rigorous theoretical convergence guarantees for COCO-EF and validate its superior learning performance over baseline methods through empirical evaluations. To the best of our knowledge, this work is among the first to rigorously demonstrate that biased compression offers substantial benefits in DL when gradient coding is employed to cope with stragglers.
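The per-iteration loop described in the abstract (encode a local gradient, add the previous compression error, apply a biased compressor, then let the server average the messages) can be sketched roughly as follows. This is a minimal illustration, not the paper's actual construction: the `top_k` compressor is just one common example of a biased compression function, the `Worker` class and all names are assumptions for illustration, and the gradient-coding redundancy over allocated data partitions is omitted.

```python
import numpy as np

def top_k(v, k):
    """Illustrative biased compressor: keep only the k largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

class Worker:
    """One non-straggling worker holding an error-feedback memory."""
    def __init__(self, dim, k):
        self.error = np.zeros(dim)  # accumulated compression residual
        self.k = k

    def compress_step(self, encoded_grad):
        # Error feedback: correct the (encoded) gradient with the stored
        # residual, compress, then keep what the compressor dropped.
        corrected = encoded_grad + self.error
        msg = top_k(corrected, self.k)
        self.error = corrected - msg
        return msg

# One illustrative round: workers send compressed messages, the server
# averages them to approximate the global gradient.
rng = np.random.default_rng(0)
dim, n_workers, k = 10, 4, 3
workers = [Worker(dim, k) for _ in range(n_workers)]
grads = [rng.standard_normal(dim) for _ in range(n_workers)]
msgs = [w.compress_step(g) for w, g in zip(workers, grads)]
approx_global = np.mean(msgs, axis=0)
```

Note that unlike an unbiased compressor, Top-k systematically drops small entries; the error-feedback memory reinjects the dropped mass in later rounds, which is what makes biased compression viable here.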