Smoothing ADMM for Non-convex and Non-smooth Hierarchical Federated Learning

📅 2025-03-11
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper addresses key challenges in hierarchical federated learning (HFL) under nonconvex and nonsmooth objectives: difficulty in modeling statistical and system heterogeneity, rigid update mechanisms, and lack of hierarchy-aware regularization. To this end, we propose a smoothed alternating direction method of multipliers (ADMM) framework. Methodologically, we are the first to introduce smoothed ADMM into HFL, enabling asynchronous updates and multiple local iterations per round; we further formulate a hierarchical constraint model, employing the total variation norm for global consensus and nonconvex penalties (e.g., MCP, SCAD) for personalized learning. Our contributions include: (i) rigorous convergence guarantees under standard assumptions; (ii) significant improvements in convergence speed and test accuracy on mainstream FL benchmarks; and (iii) enhanced robustness and generalization against data distribution shifts and heterogeneous device resources.
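The summary names a hierarchical constraint model without spelling it out. As a hedged sketch only, one consistent reading is a three-level objective in which MCP/SCAD penalties personalize devices within clusters while a total-variation-style (l1) term ties cluster models to the global one. All symbols below (f_i device losses, w_i device models, z_c cluster models, g global model, S_c cluster membership, P the nonconvex penalty) are assumptions for illustration, not the paper's notation:

```latex
% Hedged sketch of a plausible three-level objective, not the paper's exact formulation.
\min_{\{w_i\},\,\{z_c\},\,g}\;
  \sum_{c=1}^{C}\sum_{i\in\mathcal{S}_c} f_i(w_i)
  \;+\; \sum_{c=1}^{C}\sum_{i\in\mathcal{S}_c}\sum_{j} P_\lambda\!\bigl(\,\bigl|[w_i - z_c]_j\bigr|\,\bigr)
  \;+\; \mu \sum_{c=1}^{C} \bigl\lVert z_c - g \bigr\rVert_1
```

Under this reading, choosing P as MCP or SCAD lets device models depart from their cluster model at bounded cost (personalization), while the l1 total-variation term drives cluster models toward global consensus.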

๐Ÿ“ Abstract
This paper presents a hierarchical federated learning (FL) framework that extends the alternating direction method of multipliers (ADMM) with smoothing techniques, tailored for non-convex and non-smooth objectives. Unlike traditional hierarchical FL methods, our approach supports asynchronous updates and multiple updates per iteration, enhancing adaptability to heterogeneous data and system settings. Additionally, we introduce a flexible mechanism to leverage diverse regularization functions at each layer, allowing customization to the specific prior information within each cluster and accommodating (possibly) non-smooth penalty objectives. Depending on the learning goal, the framework supports both consensus and personalization: the total variation norm can be used to enforce consensus across layers, while non-convex penalties such as minimax concave penalty (MCP) or smoothly clipped absolute deviation (SCAD) enable personalized learning. Experimental results demonstrate the superior convergence rates and accuracy of our method compared to conventional approaches, underscoring its robustness and versatility for a wide range of FL scenarios.
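For reference, the two non-convex penalties named in the abstract have standard closed forms. These are the textbook definitions of MCP and SCAD, not something extracted from this paper; for t >= 0:

```latex
% MCP (minimax concave penalty), parameters \lambda > 0, \gamma > 1:
P^{\mathrm{MCP}}_{\lambda,\gamma}(t) =
\begin{cases}
  \lambda t - \dfrac{t^2}{2\gamma}, & t \le \gamma\lambda,\\[4pt]
  \dfrac{\gamma\lambda^2}{2},       & t > \gamma\lambda.
\end{cases}
\qquad
% SCAD (smoothly clipped absolute deviation), parameters \lambda > 0, a > 2:
P^{\mathrm{SCAD}}_{\lambda,a}(t) =
\begin{cases}
  \lambda t, & t \le \lambda,\\[2pt]
  \dfrac{2a\lambda t - t^2 - \lambda^2}{2(a-1)}, & \lambda < t \le a\lambda,\\[4pt]
  \dfrac{(a+1)\lambda^2}{2}, & t > a\lambda.
\end{cases}
```

Both behave like the l1 norm near zero but flatten out for large arguments, which is exactly what makes them non-convex and suited to personalization rather than strict consensus.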
Problem

Research questions and friction points this paper is trying to address.

Statistical and system heterogeneity are difficult to model in hierarchical FL, especially under non-convex and non-smooth objectives.
Existing hierarchical FL methods rely on rigid, synchronous update mechanisms that adapt poorly to heterogeneous devices and data.
Standard formulations lack hierarchy-aware regularization that can serve both consensus and personalization goals.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Smoothing ADMM for non-convex, non-smooth objectives
Asynchronous updates and multiple updates per iteration (illustrated in the sketch after this list)
Flexible regularization for consensus and personalization
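Neither the abstract nor the summary gives pseudocode, so the following is a minimal, hypothetical sketch of how a device-side step could combine the ingredients named above: Moreau-envelope smoothing of the MCP penalty, several local gradient steps per round, and a standard consensus-ADMM coupling term. Every function name and parameter here is an assumption for illustration, not the paper's algorithm.

```python
import numpy as np

def mcp_prox(v, lam, gamma, step=1.0):
    """Proximal map of the MCP penalty, applied elementwise.

    Standard closed form; requires gamma > step so the inner
    quadratic subproblem stays strongly convex."""
    soft = np.sign(v) * np.maximum(np.abs(v) - step * lam, 0.0)
    inside = np.abs(v) <= gamma * lam
    return np.where(inside, soft / (1.0 - step / gamma), v)

def smoothed_mcp_grad(v, lam, gamma, mu):
    """Gradient of the Moreau envelope of MCP: a mu-smooth
    surrogate gradient for the non-smooth, non-convex penalty."""
    return (v - mcp_prox(v, lam, gamma, step=mu)) / mu

def local_update(w, z, u, grad_f, *, lam, gamma, mu, rho, lr, n_steps):
    """Hypothetical device-side step: several gradient iterations
    on the smoothed augmented Lagrangian ('multiple local updates
    per round'), against the current cluster model z and dual u."""
    for _ in range(n_steps):
        g = (grad_f(w)
             + smoothed_mcp_grad(w - z, lam, gamma, mu)  # personalization term
             + rho * (w - z + u))                        # ADMM coupling term
        w = w - lr * g
    return w

# Toy usage with a quadratic local loss f(w) = 0.5 * ||A w - b||^2.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
w_new = local_update(np.zeros(5), np.zeros(5), np.zeros(5),
                     lambda w: A.T @ (A @ w - b),
                     lam=0.1, gamma=3.0, mu=0.5, rho=1.0,
                     lr=0.01, n_steps=10)
```

The design point the sketch tries to capture: the Moreau gradient (v - prox(v)) / mu is Lipschitz for mu small enough relative to gamma, so plain gradient descent on the smoothed subproblem is well posed, which is the usual motivation for smoothing a non-smooth ADMM subproblem before running multiple local steps.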