🤖 AI Summary
This paper addresses large-scale convex composite optimization—minimizing the sum of a smooth objective and a nonsmooth structured regularizer (e.g., ℓ₁ or group Lasso). We propose a novel “self-concordant smoothing” framework that unifies differentiable approximations of nonsmooth terms and induces variable metrics and adaptive step sizes tailored for proximal Newton-type methods. Our key contribution is Prox-GGN-SCORE, a Hessian-inverse-free proximal generalized Gauss–Newton algorithm, which enjoys global convergence guarantees under mild assumptions. Extensive experiments on synthetic and real-world datasets demonstrate that Prox-GGN-SCORE significantly outperforms state-of-the-art solvers—including Prox-Newton and L-BFGS-B—in both efficiency and scalability. To facilitate reproducibility and large-scale deployment, we release an open-source Julia package supporting both full-batch and mini-batch optimization.
📝 Abstract
We introduce a notion of self-concordant smoothing for minimizing the sum of two convex functions, one of which is smooth and the other may be nonsmooth. The key feature of our approach is a natural property of the resulting problem's structure, which provides us with a variable-metric selection method and a step-length selection rule particularly suitable for proximal Newton-type algorithms. In addition, we efficiently handle specific structures promoted by the nonsmooth function, such as $\ell_1$-regularization and group-lasso penalties. We prove the convergence of two resulting algorithms: Prox-N-SCORE, a proximal Newton algorithm, and Prox-GGN-SCORE, a proximal generalized Gauss-Newton algorithm. The Prox-GGN-SCORE algorithm highlights an important approximation procedure that significantly reduces most of the computational overhead associated with the inverse Hessian. This approximation is especially useful for overparameterized machine learning models and in mini-batch settings. Numerical examples on both synthetic and real datasets demonstrate the efficiency of our approach and its superiority over existing approaches. A Julia package implementing the proposed algorithms is available at https://github.com/adeyemiadeoye/SelfConcordantSmoothOptimization.jl.
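For readers unfamiliar with the composite setting, the following is a minimal sketch of the standard proximal operators for the two structures named above ($\ell_1$ and group lasso), together with one plain proximal gradient step. It is not Prox-N-SCORE or Prox-GGN-SCORE and does not use the released Julia package; the function names (`prox_l1`, `prox_group_lasso`, `prox_gradient_step`) and the assumption of disjoint groups are illustrative choices only.

```python
import math


def prox_l1(v, t):
    """Soft-thresholding: proximal operator of t * ||x||_1 (illustrative helper)."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]


def prox_group_lasso(v, t, groups):
    """Block soft-thresholding: prox of t * sum_g ||x_g||_2.

    Assumes `groups` is a list of disjoint index lists covering the
    coordinates that should be penalized.
    """
    out = list(v)
    for g in groups:
        norm = math.sqrt(sum(v[i] ** 2 for i in g))
        # Shrink the whole block toward zero; zero it out if its norm is <= t.
        scale = max(1.0 - t / norm, 0.0) if norm > 0 else 0.0
        for i in g:
            out[i] = scale * v[i]
    return out


def prox_gradient_step(x, grad_f, step, lam):
    """One proximal gradient step for min f(x) + lam * ||x||_1.

    `grad_f` is a user-supplied gradient of the smooth term f.
    """
    forward = [xi - step * gi for xi, gi in zip(x, grad_f(x))]
    return prox_l1(forward, step * lam)
```

Proximal Newton-type methods such as those in the paper replace the scalar `step` with a variable metric built from curvature information; the sketch keeps the scalar version only to show where the nonsmooth term enters.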