🤖 AI Summary
This work addresses the slow convergence of stochastic gradient methods and the reliance of classical trust-region methods on function value evaluations in unconstrained nonconvex optimization, proposing a novel algorithm that integrates the stochastic variance-reduced gradient (SVRG) technique with an adaptive stochastic trust-region framework. The method requires only stochastic gradient information—eliminating the need for explicit function value computations—and enhances efficiency through an adaptive trust-region radius adjustment. Notably, it is the first to combine SVRG with an adaptive trust-region mechanism, enabling the incorporation of stochastic (potentially gradient-dependent) second-order information. Theoretical analysis establishes that the algorithm converges in expectation to a first-order stationary point, achieving iteration and sample complexities comparable to state-of-the-art SVRG methods. Empirical results demonstrate its superior performance over SGD and Adam across multiple machine learning tasks.
📝 Abstract
We propose a stochastic trust-region method for unconstrained nonconvex optimization that incorporates the stochastic variance-reduced gradient (SVRG) technique to accelerate convergence. Unlike classical trust-region methods, the proposed algorithm relies solely on stochastic gradient information and does not require function value evaluations. The trust-region radius is adaptively adjusted based on a radius-control parameter and the stochastic gradient estimate. Under mild assumptions, we establish that the algorithm converges in expectation to a first-order stationary point. Moreover, the method achieves iteration and sample complexity bounds that match those of SVRG-based first-order methods, while allowing stochastic and potentially gradient-dependent second-order information. Extensive numerical experiments demonstrate that incorporating SVRG accelerates convergence, and that the use of trust-region methods and Hessian information further improves performance. We also highlight the impact of batch size and inner-loop length on efficiency, and show that the proposed method outperforms SGD and Adam on several machine learning tasks.
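To make the core idea concrete, the following is a minimal sketch of an SVRG-style gradient estimator combined with a gradient-only trust-region step on a toy least-squares problem. The snapshot/inner-loop structure is standard SVRG; the step rule and radius update (`delta`, the `0.95` shrink factor, the `0.1` step scale) are illustrative assumptions for exposition, not the paper's actual update rules or second-order model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: f(w) = (1/n) * sum_i 0.5 * (a_i^T w - b_i)^2
n, d = 200, 5
A = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
b = A @ w_true

def grad_i(w, i):
    """Stochastic gradient of the i-th component function."""
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w):
    """Full-batch gradient, computed only at snapshot points."""
    return A.T @ (A @ w - b) / n

w_tilde = np.zeros(d)   # SVRG snapshot point
delta = 1.0             # trust-region radius (illustrative update below)
for epoch in range(30):
    mu = full_grad(w_tilde)        # full gradient at the snapshot
    w = w_tilde.copy()
    for _ in range(n):             # inner loop of length m = n
        i = rng.integers(n)
        # SVRG variance-reduced gradient estimate
        g = grad_i(w, i) - grad_i(w_tilde, i) + mu
        gnorm = np.linalg.norm(g)
        if gnorm > 0:
            # Gradient-only trust-region-style step: move along -g,
            # with the step length capped by the radius `delta`
            # (no function value evaluations are used).
            w -= min(delta, 0.1 * gnorm) * g / gnorm
    w_tilde = w
    delta *= 0.95                  # illustrative radius shrinkage

print(np.linalg.norm(full_grad(w_tilde)))  # small gradient norm at the end
```

Because the variance of the estimate `g` vanishes as `w` and `w_tilde` approach a stationary point, no decaying step size is needed for the gradient estimate itself, which is the acceleration SVRG provides over plain SGD.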