🤖 AI Summary
Second-order optimization is hindered by the prohibitive computational cost of evaluating and decomposing the Hessian. This paper proposes an asynchronous, split-client second-order optimization framework that, for the first time, decouples gradient and curvature (Hessian) computations across distinct clients and updates them asynchronously, eliminating the manual hyperparameter tuning required by conventional Lazy Hessian approaches. By integrating cubic regularization with an inexact Hessian update mechanism, the framework alleviates computational latency and communication bottlenecks in high-dimensional settings. The paper establishes rigorous convergence guarantees, and empirical evaluations on both synthetic and real-world datasets demonstrate consistent superiority over vanilla and Lazy Hessian baselines, achieving a wall-clock speedup of up to order √τ while preserving the rapid convergence of second-order methods and improving practical training efficiency.
📝 Abstract
Second-order methods promise faster convergence but are rarely used in practice because Hessian computations and decompositions are far more expensive than gradients. We propose a *split-client* framework where gradients and curvature are computed asynchronously by separate clients. This abstraction captures realistic delays and inexact Hessian updates while avoiding the manual tuning required by Lazy Hessian methods. Focusing on cubic regularization, we show that our approach retains strong convergence guarantees and achieves a provable wall-clock speedup of order $\sqrt{\tau}$, where $\tau$ is the relative time needed to compute and decompose the Hessian compared to a gradient step. Since $\tau$ can be orders of magnitude larger than one in high-dimensional problems, this improvement is practically significant. Experiments on synthetic and real datasets confirm the theory: asynchronous curvature consistently outperforms vanilla and Lazy Hessian baselines, while maintaining second-order accuracy.
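To make the core idea concrete, the sketch below simulates the split-client pattern in a single process: a "gradient client" supplies a fresh gradient every iteration, while a slower "curvature client" refreshes the Hessian only every `tau` iterations, and each step minimizes the cubic-regularized model $g^\top s + \tfrac{1}{2}s^\top H s + \tfrac{M}{6}\|s\|^3$. This is a toy illustration under our own assumptions, not the paper's algorithm: the function names (`cubic_step`, `split_client_cr`), the test objective, and the parameters `M` and `tau` are all illustrative choices.

```python
import numpy as np

def cubic_step(g, H, M, tol=1e-10):
    """Solve min_s g.s + 0.5 s'Hs + (M/6)||s||^3 via the secular equation:
    s(lam) = -(H + lam*I)^{-1} g with lam = (M/2)*||s(lam)||, by bisection."""
    d = len(g)
    lam_min = np.linalg.eigvalsh(H)[0]
    lo = max(0.0, -lam_min) + 1e-12          # keep H + lam*I positive definite
    phi = lambda lam: (M / 2) * np.linalg.norm(
        np.linalg.solve(H + lam * np.eye(d), -g)) - lam
    hi = lo + 1.0
    while phi(hi) > 0:                       # expand until a sign change brackets the root
        hi *= 2
    while hi - lo > tol:                     # phi is strictly decreasing, so bisect
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > 0 else (lo, mid)
    return np.linalg.solve(H + 0.5 * (lo + hi) * np.eye(d), -g)

def split_client_cr(x0, grad, hess, M=10.0, tau=5, iters=60):
    """Cubic-regularized Newton with a stale Hessian: the curvature 'client'
    refreshes H only every tau steps, mimicking its longer compute time."""
    x, H = x0.copy(), hess(x0)
    for t in range(iters):
        if t % tau == 0:                     # asynchronous curvature refresh
            H = hess(x)
        x = x + cubic_step(grad(x), H, M)
    return x

# Hypothetical smooth test problem: f(x) = 0.25*sum(x^4) + 0.5*x'Ax, minimum at 0.
A = np.diag([1.0, 2.0, 3.0])
grad = lambda x: x**3 + A @ x
hess = lambda x: 3 * np.diag(x**2) + A
x_star = split_client_cr(np.full(3, 2.0), grad, hess)
```

Because the cubic term damps steps whenever the stale `H` underestimates curvature, the iteration stays stable between Hessian refreshes; this is the mechanism that lets the curvature client lag behind without manual step-size tuning.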