🤖 AI Summary
This work addresses a theoretical bottleneck in shuffling-based variance-reduction methods for strongly convex stochastic optimization, aiming to close the gap in gradient-complexity analysis between uniform sampling and shuffling. We propose a dynamic weighted gradient update scheme that integrates SARAH-type variance reduction with shuffling sampling, and design an Inexact variant that avoids full-batch gradient computations. For the first time, our method achieves the best-known gradient complexity $O(n + \sqrt{n/\varepsilon})$ for *any* shuffling order. The Inexact variant preserves linear convergence while reducing total computational complexity below the $O(n/\varepsilon)$ of the exact method when the sample size is very large. The core innovation lies in unifying enhanced exploration with improved convergence efficiency, thereby establishing a tighter and more general theoretical foundation for shuffling-based algorithms.
📝 Abstract
In this paper, we propose Adjusted Shuffling SARAH, a novel algorithm that integrates shuffling techniques with the well-known variance-reduced algorithm SARAH while dynamically adjusting the stochastic gradient weights in each update to enhance exploration. Our method achieves the best-known gradient complexity among shuffling-based variance-reduction methods in the strongly convex setting. This result holds for any shuffling scheme, narrowing the gap in the complexity analysis of variance-reduction methods between uniform sampling and data shuffling. Furthermore, we introduce Inexact Adjusted Reshuffling SARAH, an inexact variant of Adjusted Shuffling SARAH that eliminates the need for full-batch gradient computations. This algorithm retains the same linear convergence rate as Adjusted Shuffling SARAH while offering an advantage in total complexity when the sample size is very large.
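For concreteness, the classical SARAH recursion combined with a shuffled index order can be sketched as below. This is only a minimal illustration of the baseline that the paper builds on: the classical SARAH estimator $v_t = \nabla f_{\pi(t)}(w_t) - \nabla f_{\pi(t)}(w_{t-1}) + v_{t-1}$ applied over a random permutation. The paper's dynamic per-step weight adjustment and the inexact (full-batch-free) variant are *not* reproduced here, and the names `grad_i`, `lr`, and the step-size choice are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def shuffling_sarah_epoch(grad_i, w, lr, n, rng):
    """One epoch of the classical SARAH recursion over a shuffled order.

    grad_i(i, w): gradient of the i-th component function f_i at w,
    for the objective f(w) = (1/n) * sum_i f_i(w).
    Note: this is plain shuffled SARAH; the paper's adjusted
    per-update gradient weights are not modeled here.
    """
    perm = rng.permutation(n)  # any shuffling order is admissible
    # Anchor step: exact full gradient at the epoch start.
    v = np.mean([grad_i(i, w) for i in range(n)], axis=0)
    w_prev = w.copy()
    w = w - lr * v
    for i in perm:
        # SARAH recursive estimator: correct the previous estimate
        # with the gradient difference at the sampled index.
        v = grad_i(i, w) - grad_i(i, w_prev) + v
        w_prev = w.copy()
        w = w - lr * v
    return w
```

As a usage example, running repeated epochs on a strongly convex least-squares problem drives the iterate toward the minimizer, illustrating the linear (per-epoch geometric) convergence regime discussed in the paper.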