Adjusted Shuffling SARAH: Advancing Complexity Analysis via Dynamic Gradient Weighting

📅 2025-06-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a theoretical bottleneck in shuffling-based variance-reduction methods for strongly convex stochastic optimization, aiming to close the gap in gradient complexity analysis between uniform sampling and shuffling. We propose a dynamic weighted gradient update scheme that integrates SARAH-type variance reduction with shuffling sampling, and design an Inexact variant that avoids full-batch gradient computations. For the first time, our method achieves the best-known gradient complexity $O(n + \sqrt{n/\varepsilon})$ for *any* shuffling order. The Inexact variant preserves linear convergence while reducing total computational complexity below $O(n/\varepsilon)$ when the sample size is very large. The core innovation lies in unifying enhanced exploration capability with improved convergence efficiency, thereby establishing a tighter and more general theoretical foundation for shuffling-based algorithms.

📝 Abstract
In this paper, we propose Adjusted Shuffling SARAH, a novel algorithm that integrates shuffling techniques with the well-known variance-reduced algorithm SARAH while dynamically adjusting the stochastic gradient weights in each update to enhance exploration. Our method achieves the best-known gradient complexity for shuffling variance reduction methods in a strongly convex setting. This result applies to any shuffling technique, which narrows the gap in the complexity analysis of variance reduction methods between uniform sampling and shuffling data. Furthermore, we introduce Inexact Adjusted Reshuffling SARAH, an inexact variant of Adjusted Shuffling SARAH that eliminates the need for full-batch gradient computations. This algorithm retains the same linear convergence rate as Adjusted Shuffling SARAH while showing an advantage in total complexity when the sample size is very large.
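To make the abstract's description concrete, here is a minimal sketch of a SARAH-style recursive gradient estimator run over reshuffled epochs. The per-step weight `gamma` is a hypothetical placeholder for the paper's dynamic gradient adjustment (whose exact form is not given here); with `gamma = 1` the recursion reduces to plain shuffling SARAH, and the full-batch anchor at each epoch start is what the Inexact variant replaces with an estimate.

```python
import numpy as np

def shuffling_sarah(grad_i, n, w0, lr=0.05, epochs=30, seed=0):
    """SARAH-style recursive estimator with a fresh shuffle each epoch.

    grad_i(w, i) returns the gradient of component f_i at w.
    `gamma` is a hypothetical stand-in for the paper's dynamic weight.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(epochs):
        # Anchor: full-batch gradient at the epoch's starting point
        # (the Inexact variant would avoid this exact computation).
        v = np.mean([grad_i(w, i) for i in range(n)], axis=0)
        w_prev, w = w.copy(), w - lr * v
        for i in rng.permutation(n):       # any shuffling order is allowed
            gamma = 1.0                    # placeholder dynamic weight
            v = gamma * (grad_i(w, i) - grad_i(w_prev, i)) + v
            w_prev, w = w.copy(), w - lr * v
    return w

# Toy strongly convex problem: least squares, f_i(w) = 0.5*(a_i.w - b_i)^2.
rng = np.random.default_rng(1)
n, d = 40, 3
A = rng.standard_normal((n, d))
w_star = np.array([1.0, -2.0, 0.5])
b = A @ w_star
grad = lambda w, i: A[i] * (A[i] @ w - b[i])

w_hat = shuffling_sarah(grad, n, np.zeros(d))  # converges toward w_star
```

On this toy problem the iterates converge linearly to the least-squares solution, illustrating the linear rate the paper proves in the strongly convex setting.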
Problem

Research questions and friction points this paper is trying to address.

Enhance exploration via dynamic gradient weighting
Achieve best-known gradient complexity for shuffling methods
Eliminate full-batch gradient computations in large datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic gradient weighting enhances exploration
Best-known gradient complexity for shuffling
Inexact variant eliminates full-batch computations