Efficient Adaptive Federated Optimization

📅 2024-10-10

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

career value

228K/year

🤖 AI Summary

To address the high communication and memory overheads arising from joint server-client adaptation in cross-device federated learning, this paper proposes FedAda²—the first efficient joint adaptation framework that eliminates the need to transmit preconditioners—and its memory-optimized variant, FedAda²++. Our approach decouples preconditioner computation and employs lightweight client-side adaptive updates. Under non-convex objectives, we rigorously prove that FedAda² achieves the same convergence rate as full adaptive methods. Extensive experiments on image and text benchmarks demonstrate that, compared to baseline adaptive federated optimizers, FedAda² reduces communication costs by 40%–65% and cuts client memory footprint by 3–5×. These gains significantly enhance scalability and resource efficiency for large-scale federated deployments.

Technology Category

Application Category

📝 Abstract

Adaptive optimization is critical in federated learning, where enabling adaptivity on both the server and client sides has proven essential for achieving optimal performance. However, the scalability of such jointly adaptive systems is often hindered by resource limitations in communication and memory. In this paper, we introduce a class of efficient adaptive algorithms, named $FedAda^2$ and its enhanced version $FedAda^2$++, designed specifically for large-scale, cross-device federated environments. $FedAda^2$ optimizes communication efficiency by avoiding the transfer of preconditioners between the server and clients. Additionally, $FedAda^2$++ extends this approach by incorporating memory-efficient adaptive optimizers on the client side, further reducing on-device memory usage. Theoretically, we demonstrate that $FedAda^2$ and $FedAda^2$++ achieve the same convergence rates for general, non-convex objectives as its more resource-intensive counterparts that directly integrate joint adaptivity. Extensive empirical evaluations on image and text datasets demonstrate both the advantages of joint adaptivity and the effectiveness of $FedAda^2$/$FedAda^2$++.

Problem

Research questions and friction points this paper is trying to address.

Enhances scalability in federated learning

Reduces communication and memory resource usage

Achieves efficient adaptive optimization cross-device

Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient adaptive algorithms for federated learning

Optimized communication by avoiding preconditioner transfer

Memory-efficient adaptive optimizers on client side

🔎 Similar Papers

No similar papers found.