Efficient Adaptive Federated Optimization

📅 2024-10-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the high communication and memory overheads arising from joint server-client adaptation in cross-device federated learning, this paper proposes FedAda²—the first efficient joint adaptation framework that eliminates the need to transmit preconditioners—and its memory-optimized variant, FedAda²++. Our approach decouples preconditioner computation and employs lightweight client-side adaptive updates. Under non-convex objectives, we rigorously prove that FedAda² achieves the same convergence rate as full adaptive methods. Extensive experiments on image and text benchmarks demonstrate that, compared to baseline adaptive federated optimizers, FedAda² reduces communication costs by 40%–65% and cuts client memory footprint by 3–5×. These gains significantly enhance scalability and resource efficiency for large-scale federated deployments.

Technology Category

Application Category

📝 Abstract
Adaptive optimization is critical in federated learning, where enabling adaptivity on both the server and client sides has proven essential for achieving optimal performance. However, the scalability of such jointly adaptive systems is often hindered by resource limitations in communication and memory. In this paper, we introduce a class of efficient adaptive algorithms, named $FedAda^2$ and its enhanced version $FedAda^2$++, designed specifically for large-scale, cross-device federated environments. $FedAda^2$ optimizes communication efficiency by avoiding the transfer of preconditioners between the server and clients. Additionally, $FedAda^2$++ extends this approach by incorporating memory-efficient adaptive optimizers on the client side, further reducing on-device memory usage. Theoretically, we demonstrate that $FedAda^2$ and $FedAda^2$++ achieve the same convergence rates for general, non-convex objectives as its more resource-intensive counterparts that directly integrate joint adaptivity. Extensive empirical evaluations on image and text datasets demonstrate both the advantages of joint adaptivity and the effectiveness of $FedAda^2$/$FedAda^2$++.
Problem

Research questions and friction points this paper is trying to address.

Enhances scalability in federated learning
Reduces communication and memory resource usage
Achieves efficient adaptive optimization cross-device
Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient adaptive algorithms for federated learning
Optimized communication by avoiding preconditioner transfer
Memory-efficient adaptive optimizers on client side
🔎 Similar Papers
No similar papers found.
Su Hyeong Lee
Su Hyeong Lee
University of Chicago
OptimizationDeep LearningMachine Learning
S
Sidharth Sharma
Columbia University
M
M. Zaheer
Google DeepMind
T
Tian Li
University of Chicago