Parameter Tracking in Federated Learning with Adaptive Optimization

📅 2025-02-04
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Data heterogeneity across clients in federated learning severely degrades model convergence, and existing gradient tracking (GT) methods are limited to SGD, lacking compatibility with mainstream adaptive optimizers such as Adam. Method: We propose a novel *parameter tracking* (PT) paradigm that generalizes GT from the gradient space to the parameter space, enabling tight integration with Adam for the first time. Based on PT, we design two new federated adaptive algorithms: FAdamGT and FAdamET. Contribution/Results: Theoretically, we provide a rigorous convergence guarantee for these adaptive federated methods under non-convex objectives. Technically, we achieve this via distributed correction of first-order information and a principled federated adaptation of Adam, balancing communication efficiency and convergence stability. Extensive experiments demonstrate that our methods significantly reduce both communication and computational overhead across diverse heterogeneity settings, consistently outperforming state-of-the-art federated SGD and adaptive baselines.
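Where GT corrects raw gradients, PT applies the correction to the model update itself, which is what lets it wrap an adaptive inner step like Adam's. The sketch below is an illustrative reading of that idea, not the paper's FAdamGT/FAdamET pseudocode: the correction refresh rule and all names (`adam_step`, `local_round`, `correction`) are hypothetical.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam update; returns new weights and optimizer state."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)          # bias-corrected second moment
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def local_round(w_global, grad_fn, correction, steps=5):
    """One client round (sketch): Adam steps, each offset by a
    parameter-space correction term, the PT analogue of GT's
    gradient-space correction."""
    w = w_global.copy()
    m, v = np.zeros_like(w), np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad_fn(w)                 # local stochastic gradient
        w, m, v = adam_step(w, g, m, v, t)
        w = w - correction             # tracking correction in parameter space
    # Hypothetical refresh: fold in this client's average drift per step
    new_correction = correction + (w - w_global) / steps
    return w, new_correction
```

The server would then aggregate client models (and correction terms) as in standard FL; the paper's actual update rules and their convergence conditions are what distinguish FAdamGT from FAdamET.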

📝 Abstract
In Federated Learning (FL), model training performance is strongly impacted by data heterogeneity across clients. Gradient Tracking (GT) has recently emerged as a solution which mitigates this issue by introducing correction terms to local model updates. To date, GT has only been considered under Stochastic Gradient Descent (SGD)-based model training, while modern FL frameworks increasingly employ adaptive optimizers for improved convergence. In this work, we generalize the GT framework to a more flexible Parameter Tracking (PT) paradigm and propose two novel adaptive optimization algorithms, FAdamET and FAdamGT, that integrate PT into Adam-based FL. We provide a rigorous convergence analysis of these algorithms under non-convex settings. Our experimental results demonstrate that both proposed algorithms consistently outperform existing methods when evaluating total communication cost and total computation cost across varying levels of data heterogeneity, showing the effectiveness of correcting first-order information in federated adaptive optimization.
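For context on the baseline being generalized: GT methods in FL (e.g., SCAFFOLD-style control variates) correct each local SGD step as g − c_i + c, with a client correction term c_i and a server correction term c. A minimal sketch of that standard recipe, with illustrative names:

```python
import numpy as np

def gt_local_sgd(w_global, grad_fn, c_i, c, lr=0.1, steps=5):
    """Client update under gradient tracking: every SGD step uses the
    corrected gradient g - c_i + c, counteracting client drift."""
    w = w_global.copy()
    for _ in range(steps):
        g = grad_fn(w)                # local stochastic gradient
        w = w - lr * (g - c_i + c)    # corrected SGD step
    # SCAFFOLD-style (Option II) refresh of the client correction term
    c_i_new = c_i - c + (w_global - w) / (lr * steps)
    return w, c_i_new
```

PT replaces this gradient-space correction with one applied directly to the parameter update, which is what makes an Adam-based inner step possible.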
Problem

Research questions and friction points this paper is trying to address.

Address data heterogeneity in Federated Learning
Extend Gradient Tracking to adaptive optimizers
Enhance convergence with Parameter Tracking in FL
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive optimization in FL
Parameter Tracking paradigm
Novel FAdamET and FAdamGT algorithms
Evan Chen
Elmore Family School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN
Jianing Zhang
Purdue University
Federated Learning, Multiple Agent Systems, Differential Privacy
Shiqiang Wang
IBM T. J. Watson Research Center
Agentic AI, Collaborative & Federated AI, LLMs, Machine Learning, Optimization Algorithms
Chaoyue Liu
Purdue University, ECE department
Deep Learning Theory, Mathematical Foundation of Deep Learning, Optimization
Christopher Brinton
Elmore Family School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN