FedCanon: Non-Convex Composite Federated Learning with Efficient Proximal Operation on Heterogeneous Data

📅 2025-04-16
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the high per-client proximal computation overhead and severe convergence degradation caused by data heterogeneity in nonconvex composite federated learning (FL), this paper proposes FedCanon. Our method decouples local client updates from proximal operations, requiring only a single proximal evaluation per round at the server; it further introduces a control variate mechanism incorporating the global gradient to explicitly model and mitigate data bias. Theoretically, FedCanon is the first algorithm for nonconvex composite FL achieving single-server-proximal-evaluation per round, guaranteeing sublinear convergence without bounded heterogeneity assumptions—and linear convergence under the Polyak–Łojasiewicz (PL) condition. Empirically, FedCanon outperforms state-of-the-art methods across multiple heterogeneous benchmarks: it improves model accuracy, reduces proximal computation cost by 3–5×, decreases communication rounds by 20%–35%, and yields more stable convergence.

Technology Category

Application Category

📝 Abstract
Composite federated learning offers a general framework for solving machine learning problems with additional regularization terms. However, many existing methods require clients to perform multiple proximal operations to handle non-smooth terms and their performance are often susceptible to data heterogeneity. To overcome these limitations, we propose a novel composite federated learning algorithm called extbf{FedCanon}, designed to solve the optimization problems comprising a possibly non-convex loss function and a weakly convex, potentially non-smooth regularization term. By decoupling proximal mappings from local updates, FedCanon requires only a single proximal evaluation on the server per iteration, thereby reducing the overall proximal computation cost. It also introduces control variables that incorporate global gradient information into client updates, which helps mitigate the effects of data heterogeneity. Theoretical analysis demonstrates that FedCanon achieves sublinear convergence rates under general non-convex settings and linear convergence under the Polyak-{L}ojasiewicz condition, without relying on bounded heterogeneity assumptions. Experiments demonstrate that FedCanon outperforms the state-of-the-art methods in terms of both accuracy and computational efficiency, particularly under heterogeneous data distributions.
Problem

Research questions and friction points this paper is trying to address.

Efficient proximal operation in non-convex composite federated learning
Mitigating data heterogeneity effects in federated learning
Reducing proximal computation cost with single server evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decouples proximal mappings from local updates
Introduces control variables for global gradient integration
Achieves sublinear and linear convergence rates
🔎 Similar Papers
No similar papers found.
Y
Yuan Zhou
School of Cyber Science and Engineering, Southeast University, Nanjing 210096, China
J
Jiachen Zhong
School of Cyber Science and Engineering, Southeast University, Nanjing 210096, China
Xinli Shi
Xinli Shi
ARC DECRA Fellow
Distributed LearningMulti-Agent Reinforcement LearningMPC
G
Guanghui Wen
School of Automation, Southeast University, Nanjing 210096, China
Xinghuo Yu
Xinghuo Yu
FAA, FIFAC, HonFIEAust, FIEEE, FAICD, Distinguished Professor @ RMIT University
Sliding Mode ControlControl SystemsComplex NetworksSmart GridsIntelligent Systems