Function-Space ADMM for Decentralized Federated Learning: A Control Theoretic Perspective

📅 2026-05-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the performance degradation that decentralized federated learning suffers when data across devices are non-independent and identically distributed (non-IID). It introduces, for the first time, the Alternating Direction Method of Multipliers (ADMM) into function space: update directions are derived from the convexity of the loss functional and then projected back into parameter space via knowledge distillation. From a control-theoretic perspective, a stabilization coefficient, interpreted as a proportional-integral (PI) term, is added to improve convergence and model consistency across devices. Under extreme non-IID settings, including the case where each device holds data from only a single label, the proposed method converges faster and more stably than existing decentralized federated learning approaches and reaches higher accuracy and better consensus among devices.
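The convexity that the summary relies on can be checked numerically on a small case. The sketch below assumes a softmax cross-entropy loss, which the summary does not name, and tests midpoint convexity with respect to the model outputs (logits) rather than the parameters. All function names are hypothetical and chosen only for illustration.

```python
import math
import random


def cross_entropy(logits, label):
    """Softmax cross-entropy of one example, computed stably from raw logits."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_norm - logits[label]


def midpoint_gap(a, b, label):
    """Mean of the two endpoint losses minus the loss at the midpoint.

    A convex loss gives a non-negative gap for every pair (a, b).
    """
    mid = [(x + y) / 2.0 for x, y in zip(a, b)]
    ends = 0.5 * (cross_entropy(a, label) + cross_entropy(b, label))
    return ends - cross_entropy(mid, label)


# Assumed setup: 3 classes, random logit vectors, fixed seed for repeatability.
rng = random.Random(0)
gaps = []
for _ in range(1000):
    a = [rng.uniform(-5.0, 5.0) for _ in range(3)]
    b = [rng.uniform(-5.0, 5.0) for _ in range(3)]
    gaps.append(midpoint_gap(a, b, label=rng.randrange(3)))

min_gap = min(gaps)
print("smallest midpoint gap:", min_gap)
```

The smallest gap stays non-negative up to floating-point error because log-sum-exp is convex in the logits. The same loss viewed as a function of neural network weights carries no such guarantee, which is the motivation for deriving updates in function space.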

📝 Abstract

Decentralized federated learning (FL) is a promising approach for training machine learning models on sensor networks, Internet of Things (IoT) devices, and other edge systems where no central server exists. While FL offers advantages such as preserving data privacy, it often suffers from non-independent and identically distributed (non-IID) data across devices, which causes significant performance degradation. This issue is particularly severe when directly optimizing model parameters, because neural network training is inherently non-convex and standard convergence guarantees for convex optimization do not apply. Unlike existing decentralized FL methods that primarily operate in parameter space, we propose federated function-space alternating direction method of multipliers (FedF-ADMM). FedF-ADMM exploits the convexity of loss functionals within function space to derive alternating direction method of multipliers (ADMM)-based update directions, which are subsequently projected onto the parameter space via knowledge distillation. We further introduce a stabilization coefficient to enhance robustness under severe non-IID settings and analyze its behavior from a control-theoretic perspective by interpreting it as a proportional-integral (PI) term. Experiments under challenging non-IID scenarios, including settings where each device has data from only a single label, demonstrate that FedF-ADMM achieves faster and more stable convergence than existing decentralized FL methods, while attaining higher accuracy and better consensus among devices.
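To make the function-space idea concrete, here is a minimal sketch of a textbook decentralized consensus ADMM run directly on model outputs. It is not the FedF-ADMM algorithm itself. Several things are assumptions made for illustration: each device's function is represented by its outputs on a shared set of probe inputs, the local loss is a squared error to single-label targets, the devices sit on a ring, and the penalty `c` is fixed. The knowledge distillation step that maps outputs back to parameters is omitted, as is the paper's stabilization coefficient. The dual variable accumulates the disagreement with neighbors over rounds, which is one way to read an integral action, while the penalty acts proportionally on the current disagreement.

```python
def ring_neighbors(n):
    """Neighbors of each device on a ring graph (hypothetical topology)."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


def function_space_admm(targets, neighbors, c=0.5, steps=200):
    """Consensus ADMM on per-device outputs with loss 0.5 * ||z - target||^2."""
    n, m = len(targets), len(targets[0])
    z = [[0.0] * m for _ in range(n)]     # outputs on the shared probe inputs
    dual = [[0.0] * m for _ in range(n)]  # accumulated disagreement (integral-like)
    for _ in range(steps):
        z_new = []
        for i in range(n):
            deg = len(neighbors[i])
            row = []
            for p in range(m):
                # Closed-form minimizer of the local loss plus the dual term plus
                # c * sum_j (z - (z_i + z_j) / 2)^2 over the neighbors j.
                pull = sum(z[i][p] + z[j][p] for j in neighbors[i])
                numerator = targets[i][p] - dual[i][p] + c * pull
                row.append(numerator / (1.0 + 2.0 * c * deg))
            z_new.append(row)
        z = z_new
        for i in range(n):
            for p in range(m):
                # Dual ascent: add the current disagreement with the neighbors.
                gap = sum(z[i][p] - z[j][p] for j in neighbors[i])
                dual[i][p] += c * gap
    return z


# Single-label (extreme non-IID) targets: each device only sees one class.
targets = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
outputs = function_space_admm(targets, ring_neighbors(4))
for device, row in enumerate(outputs):
    print(device, [round(v, 4) for v in row])
```

Because the local loss is convex in the outputs, every device converges to the same vector, the average of the single-label targets, even though no device ever sees more than one class. In a full method, each device would then fit its network to these agreed outputs, for example by distillation.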

Problem

Research questions and friction points this paper is trying to address.

decentralized federated learning

non-IID data

function-space optimization

convergence instability

model consensus

Innovation

Methods, ideas, or system contributions that make the work stand out.

function-space optimization

decentralized federated learning

ADMM