🤖 AI Summary
In large-scale statistical learning, ill-conditioned objective functions and nonsmooth regularizers lead to slow convergence and high computational cost.
Method: This paper proposes a variance-reduced stochastic optimization algorithm that integrates sketching-based preconditioning with scaled proximal mappings. Technically, it unifies sketching, SAGA/SVRG-type variance reduction, preconditioned gradient methods, and composite convex optimization theory.
Contribution/Results: The method achieves condition-number-independent linear convergence for problems exhibiting both ill-conditioning and nonsmoothness, and remains effective even when the objective is nonconvex or the preconditioner is updated only infrequently. Empirical evaluation on Lasso and logistic regression tasks demonstrates an approximately 20× speedup over Catalyst, SAGA, and SVRG, significantly improving the efficiency of solving large-scale ill-conditioned learning problems.
📝 Abstract
Regularized empirical risk minimization (rERM) has become important in data-intensive fields such as genomics and advertising, with stochastic gradient methods typically used to solve the largest problems. However, ill-conditioned objectives and non-smooth regularizers undermine the performance of traditional stochastic gradient methods, leading to slow convergence and significant computational costs. To address these challenges, we propose the $\texttt{SAPPHIRE}$ ($\textbf{S}$ketching-based $\textbf{A}$pproximations for $\textbf{P}$roximal $\textbf{P}$reconditioning and $\textbf{H}$essian $\textbf{I}$nexactness with Variance-$\textbf{RE}$duced Gradients) algorithm, which integrates sketch-based preconditioning to tackle ill-conditioning and uses a scaled proximal mapping to handle the non-smooth regularizer. This stochastic variance-reduced algorithm achieves condition-number-free linear convergence to the optimum, delivering an efficient and scalable solution for ill-conditioned, composite, large-scale convex machine learning problems. Extensive experiments on Lasso and logistic regression demonstrate that $\texttt{SAPPHIRE}$ often converges $20$ times faster than other common choices such as $\texttt{Catalyst}$, $\texttt{SAGA}$, and $\texttt{SVRG}$. This advantage persists even when the objective is non-convex or the preconditioner is infrequently updated, highlighting its robust and practical effectiveness.
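To make the ingredients concrete, here is a minimal, illustrative sketch of one step of a SAPPHIRE-style method for the Lasso: a Gaussian-sketched Hessian serves as the preconditioner, the gradient is variance-reduced in SVRG fashion, and the non-smooth $\ell_1$ term is handled through a scaled proximal mapping (solved by an inner proximal-gradient loop). This is an assumption-laden reconstruction from the abstract, not the paper's exact algorithm; the function names (`sapphire_like_step`, `scaled_prox_l1`) and all step-size choices are illustrative.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of t*||.||_1 under the standard Euclidean metric.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def scaled_prox_l1(z, P, lam, iters=50):
    # Scaled proximal mapping: argmin_x lam*||x||_1 + 0.5*(x-z)^T P (x-z),
    # solved approximately by proximal gradient on the subproblem.
    L = np.linalg.norm(P, 2)  # Lipschitz constant of the quadratic part
    x = z.copy()
    for _ in range(iters):
        x = soft_threshold(x - (P @ (x - z)) / L, lam / L)
    return x

def sketched_preconditioner(A, m, reg=1e-3, rng=None):
    # Gaussian sketch: P = (SA)^T (SA) / n + reg*I approximates the
    # least-squares Hessian A^T A / n using only m << n sketched rows.
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = A.shape
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    SA = S @ A
    return SA.T @ SA / n + reg * np.eye(d)

def sapphire_like_step(x, A, b, lam, P, full_grad, x_ref, i, eta=1.0):
    # SVRG-style variance-reduced gradient at sample i (least-squares loss),
    # preconditioned by P, followed by a scaled proximal step.
    ai = A[i]
    g = ai * (ai @ x - b[i]) - ai * (ai @ x_ref - b[i]) + full_grad
    z = x - eta * np.linalg.solve(P, g)
    return scaled_prox_l1(z, P / eta, lam)
```

With `P = I` the scaled proximal mapping reduces to ordinary soft-thresholding, so the step collapses to standard prox-SVRG; the point of the sketched `P` is that the preconditioned problem is well-conditioned, which is what removes the condition-number dependence from the rate.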