🤖 AI Summary
In distributed learning, gradient compression alleviates communication bottlenecks but often degrades convergence. Existing analyses of error-feedback methods (EF and EF21) are not tight, which prevents a precise characterization of their fundamental performance limits. This paper constructs the first *optimal* Lyapunov functions for both EF and EF21, enabling the derivation of *matching upper and lower bounds* on their convergence rates in the smooth convex setting. These bounds allow a rigorous, apples-to-apples comparison of three key algorithms: EF, EF21, and compressed gradient descent. The resulting convergence rates are the tightest theoretical guarantees known to date and, crucially, reveal for the first time the intrinsic impact of the error-feedback mechanism on convergence speed, along with its fundamental performance ceiling. This work thus establishes a solid theoretical foundation for designing and selecting communication-efficient compression strategies in distributed optimization.
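For reference, the update rules below are the standard single-node forms of the three methods being compared, with compressor $\mathcal{C}$, step size $\gamma$, and objective $f$; this formulation is an illustrative assumption, and the paper's exact setting may differ:

$$
\begin{aligned}
\text{CGD:}\quad & x_{t+1} = x_t - \gamma\, \mathcal{C}\big(\nabla f(x_t)\big),\\
\mathrm{EF}\text{:}\quad & \Delta_t = \mathcal{C}\big(e_t + \gamma \nabla f(x_t)\big),\qquad x_{t+1} = x_t - \Delta_t,\qquad e_{t+1} = e_t + \gamma \nabla f(x_t) - \Delta_t,\\
\mathrm{EF}^{21}\text{:}\quad & x_{t+1} = x_t - \gamma\, g_t,\qquad g_{t+1} = g_t + \mathcal{C}\big(\nabla f(x_{t+1}) - g_t\big).
\end{aligned}
$$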
📝 Abstract
Communication between agents often constitutes a major computational bottleneck in distributed learning. One of the most common mitigation strategies is to compress the information exchanged, thereby reducing communication overhead. To counteract the degradation in convergence associated with compressed communication, error feedback schemes -- most notably $\mathrm{EF}$ and $\mathrm{EF}^{21}$ -- were introduced. In this work, we provide a tight analysis of both of these methods. Specifically, we find the Lyapunov function that yields the best possible convergence rate for each method -- with matching lower bounds. This principled approach yields sharp performance guarantees and enables a rigorous, apples-to-apples comparison between $\mathrm{EF}$, $\mathrm{EF}^{21}$, and compressed gradient descent. Our analysis is carried out in a simplified yet representative setting, which allows for clean theoretical insights and fair comparison of the underlying mechanisms.
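As a rough illustration of how the three mechanisms differ in practice, here is a minimal single-node sketch comparing compressed gradient descent, EF, and EF21 on a toy quadratic. The top-$k$ compressor, the objective, the step size, and the iteration budget are illustrative assumptions, not the paper's experimental setup:

```python
# Minimal single-node sketch of CGD, EF, and EF21 (illustrative, not the paper's setup).
import numpy as np

def top_k(v, k=2):
    """Keep the k largest-magnitude entries of v; zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def grad(x, A, b):
    """Gradient of the quadratic f(x) = 0.5 * x^T A x - b^T x."""
    return A @ x - b

rng = np.random.default_rng(0)
d = 10
A = np.diag(np.linspace(1.0, 10.0, d))   # simple ill-conditioned quadratic
b = rng.standard_normal(d)
x_star = np.linalg.solve(A, b)
gamma = 0.05                              # step size (assumed, not tuned)
T = 500

# Compressed gradient descent: x_{t+1} = x_t - gamma * C(grad f(x_t)).
x = np.zeros(d)
for _ in range(T):
    x -= gamma * top_k(grad(x, A, b))
print("CGD  error:", np.linalg.norm(x - x_star))

# EF: accumulate the compression error e_t and re-inject it next round.
x, e = np.zeros(d), np.zeros(d)
for _ in range(T):
    p = e + gamma * grad(x, A, b)
    delta = top_k(p)
    x -= delta
    e = p - delta
print("EF   error:", np.linalg.norm(x - x_star))

# EF21: maintain a gradient estimate g_t and compress only its change.
x = np.zeros(d)
g = top_k(grad(x, A, b))
for _ in range(T):
    x -= gamma * g
    g = g + top_k(grad(x, A, b) - g)
print("EF21 error:", np.linalg.norm(x - x_star))
```

The sketch only contrasts the update rules; it makes no claim about which method wins in general, which is precisely the question the paper's matching upper and lower bounds address.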