Taming Preconditioner Drift: Unlocking the Potential of Second-Order Optimizers for Federated Learning on Non-IID Data

📅 2026-02-22
🤖 AI Summary
In federated learning, second-order optimizers often become unstable or even diverge under non-IID data due to preconditioner drift. This work proposes FedPAC, the first framework to explicitly identify and address this issue by decoupling parameter aggregation from geometric synchronization. FedPAC introduces preconditioner alignment—achieved through a globally constructed reference and client-side warm-starting—and a correction mechanism that steers local updates along the global optimization direction, thereby explicitly aligning local preconditioners across clients. Theoretical analysis establishes convergence guarantees under non-convex settings, and extensive experiments on vision and language tasks demonstrate significant performance gains. Notably, on CIFAR-100 with a Vision Transformer (ViT), FedPAC achieves an absolute accuracy improvement of up to 5.8%.

📝 Abstract
Second-order optimizers can significantly accelerate large-scale training, yet their naive federated variants are often unstable or even diverge on non-IID data. We show that a key culprit is \emph{preconditioner drift}: client-side second-order training induces heterogeneous \emph{curvature-defined geometries} (i.e., preconditioner coordinate systems), and server-side model averaging combines updates computed under incompatible metrics, corrupting the global descent direction. To address this geometric mismatch, we propose \texttt{FedPAC}, a \emph{preconditioner alignment and correction} framework for reliable federated second-order optimization. \texttt{FedPAC} explicitly decouples parameter aggregation from geometry synchronization by: (i) \textbf{Alignment} (i.e., aggregating local preconditioners into a global reference and warm-starting clients via the global preconditioner); and (ii) \textbf{Correction} (i.e., steering local preconditioned updates using a globally preconditioned direction to suppress long-term drift). We provide drift-coupled non-convex convergence guarantees with linear speedup under partial participation. Empirically, \texttt{FedPAC} consistently improves stability and accuracy across vision and language tasks, achieving up to $5.8\%$ absolute accuracy gain on CIFAR-100 with ViTs. Code is available at https://anonymous.4open.science/r/FedPAC-8B24.
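The alignment and correction steps described in the abstract can be sketched with diagonal (Adam-style) preconditioners. This is a minimal illustrative toy, not the paper's exact formulation: the function names, the uniform aggregation weights, and the mixing coefficient `beta` are all assumptions made here for clarity.

```python
import numpy as np

def aggregate_preconditioners(local_precs, weights):
    """Alignment: build a global reference preconditioner as a
    weighted average of the clients' local preconditioners."""
    return sum(w * p for w, p in zip(weights, local_precs))

def corrected_local_step(params, grad, local_prec, global_prec,
                         lr=0.1, beta=0.5):
    """Correction: blend the locally preconditioned direction with the
    globally preconditioned direction to suppress drift.
    `beta` (mixing weight) is a hypothetical knob, not from the paper."""
    local_dir = grad / (np.sqrt(local_prec) + 1e-8)
    global_dir = grad / (np.sqrt(global_prec) + 1e-8)
    return params - lr * ((1 - beta) * local_dir + beta * global_dir)

# Toy round: two clients sharing a 3-dimensional parameter vector.
params = np.zeros(3)
grads = [np.array([1.0, -2.0, 0.5]), np.array([0.5, -1.0, 1.5])]
local_precs = [g ** 2 for g in grads]  # per-client curvature proxies
global_prec = aggregate_preconditioners(local_precs, [0.5, 0.5])

# Each client warm-starts from the global preconditioner, takes a
# corrected step, and the server averages the resulting parameters.
client_params = [corrected_local_step(params, g, p, global_prec)
                 for g, p in zip(grads, local_precs)]
new_params = np.mean(client_params, axis=0)
print(new_params.shape)  # (3,)
```

The key design point the sketch mirrors is the decoupling: parameters are averaged as usual, while the preconditioners are synchronized through a separate global reference rather than being averaged into the weights themselves.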
Problem

Research questions and friction points this paper is trying to address.

federated learning
second-order optimization
non-IID data
preconditioner drift
geometric mismatch
Innovation

Methods, ideas, or system contributions that make the work stand out.

preconditioner drift
second-order optimization
federated learning
non-IID data
geometry alignment
Junkang Liu
College of Intelligence and Computing, Tianjin University, Tianjin, China
Fanhua Shang
Professor at Tianjin University
Machine Learning, Data Mining, Computer Vision
Hongying Liu
Tianjin University
Machine Learning, Image Processing
Jin Liu
School of Data Science, The Chinese University of Hong Kong, Shenzhen
Statistical Genetics/Genomics
Weixin An
School of Artificial Intelligence, Xidian University, Xi’an, China
Yuanyuan Liu
School of Artificial Intelligence, Xidian University, Xi’an, China