🤖 AI Summary
This paper identifies the fundamental mechanism behind performance degradation in federated optimization under data heterogeneity: discrepancies among clients’ local optima elevate the lower bound of the global objective function, rendering perfect global fit infeasible and causing the global model to converge to an oscillatory region rather than a fixed point.
Method: Grounded in distributed optimization theory, we establish the first rigorous analytical link between the divergence of clients' local optima and the global model's convergence behavior. Our approach combines theoretical derivation with empirical validation across diverse tasks and model architectures, and we open-source a unified framework, FedTorch.
Contribution/Results: We provide a verifiable theoretical explanation for federated learning's performance degradation. We show, both theoretically and empirically, that global models cannot perfectly fit all client data under heterogeneity, and that convergence oscillation is an inherent, provable phenomenon. This work offers a novel theoretical perspective and a testable foundation for federated optimization.
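The elevated-lower-bound claim can be made concrete with a toy quadratic instance (an illustrative assumption for intuition, not the paper's general setting). Suppose each client $i$ has a quadratic local objective centered at its own optimum $w_i^\ast$:

```latex
\[
F_i(w) = \tfrac{1}{2}\,\|w - w_i^\ast\|^2, \qquad
F(w) = \frac{1}{n}\sum_{i=1}^{n} F_i(w).
\]
\[
\arg\min_w F(w) = \bar{w} := \frac{1}{n}\sum_{i=1}^{n} w_i^\ast,
\qquad
\min_w F(w) = \frac{1}{2n}\sum_{i=1}^{n} \|w_i^\ast - \bar{w}\|^2 .
\]
```

The global minimum value equals half the variance of the local optima, so whenever the $w_i^\ast$ are not all equal (i.e., under heterogeneity), this lower bound is strictly positive: no single global model fits every client's data perfectly.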
📝 Abstract
Federated optimization is a constrained form of distributed optimization that enables training a global model without directly sharing client data. Although existing algorithms can guarantee convergence in theory and often achieve stable training in practice, the reasons behind performance degradation under data heterogeneity remain unclear. To address this gap, the main contribution of this paper is to provide a theoretical perspective that explains why such degradation occurs. We introduce the assumption that heterogeneous client data lead to distinct local optima, and show that this assumption implies two key consequences: 1) the distance among clients' local optima raises the lower bound of the global objective, making perfect fitting of all client data impossible; and 2) in the final training stage, the global model oscillates within a region instead of converging to a single optimum, limiting its ability to fully fit the data. These results provide a principled explanation for performance degradation in non-iid settings, which we further validate through experiments across multiple tasks and neural network architectures. The framework used in this paper is open-sourced at: https://github.com/NPCLEI/fedtorch.
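Both consequences in the abstract can be reproduced in a minimal sketch. The snippet below (an illustrative toy, not the paper's experimental setup or the fedtorch framework) uses two clients with scalar quadratic losses whose minima differ, runs FedAvg-style rounds of local gradient steps, and alternates which client participates as a crude stand-in for partial participation. The global loss never drops below the heterogeneity-induced lower bound, and the late-stage iterates oscillate within a region rather than settling at a point:

```python
# Toy sketch (assumed setup, not the paper's method): two clients whose
# quadratic losses F_i(w) = 0.5 * (w - opt_i)^2 have different minima.
client_optima = [-1.0, 1.0]

def global_loss(w):
    """Average of the clients' local losses."""
    return sum(0.5 * (w - o) ** 2 for o in client_optima) / len(client_optima)

# The global minimum value is half the variance of the local optima, so
# distinct optima force a strictly positive floor on the global objective.
w_bar = sum(client_optima) / len(client_optima)
lower_bound = global_loss(w_bar)  # = 0.5 for optima at -1 and +1

def round_step(w, opt, lr=0.5, local_steps=5):
    """One communication round: the participating client runs several local
    gradient steps on its own loss, pulling w toward its own optimum."""
    for _ in range(local_steps):
        w -= lr * (w - opt)  # gradient of 0.5 * (w - opt)^2
    return w

# Alternate which client participates each round and record the trajectory.
w, ws, losses = 5.0, [], []
for t in range(50):
    w = round_step(w, client_optima[t % 2])
    ws.append(w)
    losses.append(global_loss(w))

print(f"lower bound on global loss: {lower_bound:.3f}")
print(f"last two iterates: {ws[-2]:.3f}, {ws[-1]:.3f}")  # bounce between regions
print(f"final global loss: {losses[-1]:.3f}")  # stays above the lower bound
```

In this toy, the iterates settle into a stable two-point cycle near the two client optima instead of converging to the global minimizer, mirroring the oscillatory final-stage behavior the paper describes.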