AI Summary
Severe data heterogeneity across clients significantly degrades model convergence in federated learning, and existing gradient tracking (GT) methods are limited to SGD, lacking compatibility with mainstream adaptive optimizers such as Adam.
Method: We propose a novel *parameter tracking* (PT) paradigm that generalizes GT from the gradient space to the parameter space, enabling, for the first time, tight integration with Adam. Based on PT, we design two new federated adaptive algorithms: FAdamGT and FAdamET.
Contribution/Results: Theoretically, we provide the first rigorous convergence guarantee for adaptive federated optimization under non-convex objectives. Technically, we achieve this via distributed first-order information correction and a principled federated adaptation of Adam, balancing communication efficiency and convergence stability. Extensive experiments demonstrate that our methods significantly reduce both communication and computational overhead across diverse heterogeneity settings, consistently outperforming state-of-the-art federated SGD and adaptive baselines.
Abstract
In Federated Learning (FL), model training performance is strongly impacted by data heterogeneity across clients. Gradient Tracking (GT) has recently emerged as a solution that mitigates this issue by introducing correction terms to local model updates. To date, GT has only been considered under Stochastic Gradient Descent (SGD)-based model training, while modern FL frameworks increasingly employ adaptive optimizers for improved convergence. In this work, we generalize the GT framework to a more flexible Parameter Tracking (PT) paradigm and propose two novel adaptive optimization algorithms, FAdamET and FAdamGT, that integrate PT into Adam-based FL. We provide a rigorous convergence analysis of these algorithms under non-convex settings. Our experimental results demonstrate that both proposed algorithms consistently outperform existing methods in total communication cost and total computation cost across varying levels of data heterogeneity, showing the effectiveness of correcting first-order information in federated adaptive optimization.