Why Federated Optimization Fails to Achieve Perfect Fitting? A Theoretical Perspective on Client-Side Optima

📅 2025-11-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper identifies the fundamental mechanism behind performance degradation in federated optimization under data heterogeneity: discrepancies among clients' local optima raise the lower bound of the global objective, making a perfect global fit infeasible and causing the global model to converge to an oscillatory region rather than a fixed point. Method: grounded in distributed optimization theory, the authors establish the first rigorous analytical link between the divergence of local optima and global convergence behavior, combining theoretical derivation with empirical validation across diverse tasks and model architectures, and open-sourcing a unified framework, FedTorch. Contribution/Results: the paper provides a verifiable theoretical explanation for federated learning's performance degradation, proving both theoretically and empirically that a global model cannot perfectly fit all client data under heterogeneity, and that convergence oscillation is an inherent, provable phenomenon. This work offers a novel theoretical perspective and a testable foundation for federated optimization.

📝 Abstract
Federated optimization is a constrained form of distributed optimization that enables training a global model without directly sharing client data. Although existing algorithms can guarantee convergence in theory and often achieve stable training in practice, the reasons behind performance degradation under data heterogeneity remain unclear. To address this gap, the main contribution of this paper is to provide a theoretical perspective that explains why such degradation occurs. We introduce the assumption that heterogeneous client data lead to distinct local optima, and show that this assumption implies two key consequences: 1) the distance among clients' local optima raises the lower bound of the global objective, making perfect fitting of all client data impossible; and 2) in the final training stage, the global model oscillates within a region instead of converging to a single optimum, limiting its ability to fully fit the data. These results provide a principled explanation for performance degradation in non-iid settings, which we further validate through experiments across multiple tasks and neural network architectures. The framework used in this paper is open-sourced at: https://github.com/NPCLEI/fedtorch.
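The abstract's first claim, that the distance among clients' local optima raises the lower bound of the global objective, can be illustrated with a toy FedAvg run (a hypothetical sketch, not the paper's FedTorch code: `fedavg`, `m1`, `m2`, and the scalar quadratic losses are all illustrative assumptions). With two clients whose losses f_i(w) = (w - m_i)^2 have distinct minima, the global objective F(w) = (f_1(w) + f_2(w))/2 attains its minimum ((m_1 - m_2)/2)^2 > 0, so zero global loss is unreachable however long training runs.

```python
# Toy sketch (not the paper's code): two clients with scalar quadratic
# losses f_i(w) = (w - m_i)^2 whose minima m_i differ (heterogeneity).
# The global objective F(w) = 0.5*(f_1 + f_2) is minimized at the
# midpoint (m_1 + m_2)/2 with value ((m_1 - m_2)/2)^2 > 0, so the gap
# between local optima lifts the floor of the global loss above zero.

def fedavg(m1, m2, rounds=50, local_steps=20, lr=0.1):
    w = 0.0  # global model (a scalar, for simplicity)
    for _ in range(rounds):
        updates = []
        for m in (m1, m2):
            wl = w
            for _ in range(local_steps):   # local GD drifts toward each m_i
                wl -= lr * 2 * (wl - m)
            updates.append(wl)
        w = sum(updates) / len(updates)    # server averages client models
    return w

m1, m2 = -1.0, 3.0
w = fedavg(m1, m2)
global_loss = 0.5 * ((w - m1) ** 2 + (w - m2) ** 2)
floor = ((m1 - m2) / 2) ** 2               # provable lower bound of F
print(w, global_loss, floor)               # loss sits at the floor, not at 0
```

Even though each client can drive its own loss to zero locally, the averaged model settles near the midpoint, where the global loss equals the positive floor rather than zero.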
Problem

Research questions and friction points this paper is trying to address.

- Explains performance degradation in federated learning under data heterogeneity
- Shows that divergent client-side optima prevent perfect fitting of all client data
- Demonstrates that global-model oscillation limits convergence to a single optimum
Innovation

Methods, ideas, or system contributions that make the work stand out.

- Analyzes client-side optima in federated learning
- Proves the impossibility of perfect data fitting under heterogeneity
- Explains global-model oscillation in the final training stage
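The oscillation claim can likewise be sketched numerically (a hypothetical toy, not the paper's experiment: `fedavg_one_client`, the round-robin sampling, and all constants are illustrative assumptions). When only one client trains per round, each round pulls the global model toward that client's optimum, so late-round iterates keep bouncing within a region instead of settling at the global minimum.

```python
# Hedged sketch (not the paper's experiment): with partial participation,
# one client sampled per round, the global model chases whichever local
# optimum was just trained on, so late-round iterates oscillate in a
# region rather than converging to a fixed point.

def fedavg_one_client(m_list, rounds=40, local_steps=20, lr=0.1):
    w, history = 0.0, []
    for t in range(rounds):
        m = m_list[t % len(m_list)]        # deterministic round-robin sampling
        for _ in range(local_steps):       # local GD pulls w toward m
            w -= lr * 2 * (w - m)
        history.append(w)
    return history

hist = fedavg_one_client([-1.0, 3.0])
tail = hist[-10:]
print(max(tail) - min(tail))  # spread stays large: iterates keep oscillating
```

The spread of the final iterates stays on the order of the distance between the local optima, matching the paper's picture of convergence to an oscillatory region whose size is governed by client heterogeneity.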
Zhongxiang Lei (Beijing Institute of Technology)
Qi Yang (Beijing Institute of Technology)
Ping Qiu (Beijing Institute of Technology)
Gang Zhang (Tsinghua University)
Yuanchi Ma (Beijing Institute of Technology)
Jinyan Liu (Beijing Institute of Technology)