FedCanon: Non-Convex Composite Federated Learning with Efficient Proximal Operation on Heterogeneous Data

📅 2025-04-16

📈 Citations: 0

✨ Influential: 0

career value

226K/year

🤖 AI Summary

To address the high per-client proximal computation overhead and severe convergence degradation caused by data heterogeneity in nonconvex composite federated learning (FL), this paper proposes FedCanon. Our method decouples local client updates from proximal operations, requiring only a single proximal evaluation per round at the server; it further introduces a control variate mechanism incorporating the global gradient to explicitly model and mitigate data bias. Theoretically, FedCanon is the first algorithm for nonconvex composite FL achieving single-server-proximal-evaluation per round, guaranteeing sublinear convergence without bounded heterogeneity assumptions—and linear convergence under the Polyak–Łojasiewicz (PL) condition. Empirically, FedCanon outperforms state-of-the-art methods across multiple heterogeneous benchmarks: it improves model accuracy, reduces proximal computation cost by 3–5×, decreases communication rounds by 20%–35%, and yields more stable convergence.

Technology Category

Application Category

📝 Abstract

Composite federated learning offers a general framework for solving machine learning problems with additional regularization terms. However, many existing methods require clients to perform multiple proximal operations to handle non-smooth terms and their performance are often susceptible to data heterogeneity. To overcome these limitations, we propose a novel composite federated learning algorithm called extbf{FedCanon}, designed to solve the optimization problems comprising a possibly non-convex loss function and a weakly convex, potentially non-smooth regularization term. By decoupling proximal mappings from local updates, FedCanon requires only a single proximal evaluation on the server per iteration, thereby reducing the overall proximal computation cost. It also introduces control variables that incorporate global gradient information into client updates, which helps mitigate the effects of data heterogeneity. Theoretical analysis demonstrates that FedCanon achieves sublinear convergence rates under general non-convex settings and linear convergence under the Polyak-{L}ojasiewicz condition, without relying on bounded heterogeneity assumptions. Experiments demonstrate that FedCanon outperforms the state-of-the-art methods in terms of both accuracy and computational efficiency, particularly under heterogeneous data distributions.

Problem

Research questions and friction points this paper is trying to address.

Efficient proximal operation in non-convex composite federated learning

Mitigating data heterogeneity effects in federated learning

Reducing proximal computation cost with single server evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decouples proximal mappings from local updates

Introduces control variables for global gradient integration

Achieves sublinear and linear convergence rates

🔎 Similar Papers

A Hierarchical Federated Learning Approach for the Internet of Things