A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization

📅 2023-12-02
🏛️ arXiv.org
📈 Citations: 3
Influential: 1
🤖 AI Summary
Random reshuffling methods are well understood in the smooth setting, but comparable theoretical foundations for nonsmooth nonconvex finite-sum optimization are largely missing. Method: We propose the normal map-based proximal random reshuffling (norm-PRR) algorithm, which combines normal maps and proximal operators within a random reshuffling framework. Contribution/Results: norm-PRR achieves an $O(n^{-1/3}T^{-2/3})$ iteration complexity bound for nonsmooth nonconvex finite-sum problems, improving the currently known bounds for this class by a factor of $n^{-1/3}$. We prove linear convergence under the (global) Polyak–Łojasiewicz (PL) condition, convergence of the whole iterate sequence to a single stationary point under the (local) Kurdyka–Łojasiewicz (KL) inequality, and last-iterate convergence rates that can match those of the smooth, strongly convex setting, bridging non-asymptotic and asymptotic analyses. Numerical experiments on nonsmooth nonconvex classification tasks illustrate norm-PRR's efficiency and robustness.
📝 Abstract
Random reshuffling techniques are prevalent in large-scale applications, such as training neural networks. While the convergence and acceleration effects of random reshuffling-type methods are fairly well understood in the smooth setting, far fewer results are available in the nonsmooth case. In this work, we design a new normal map-based proximal random reshuffling (norm-PRR) method for nonsmooth nonconvex finite-sum problems. We show that norm-PRR achieves the iteration complexity $O(n^{-1/3}T^{-2/3})$, where $n$ denotes the number of component functions $f(\cdot,i)$ and $T$ counts the total number of iterations. This improves the currently known complexity bounds for this class of problems by a factor of $n^{-1/3}$. In addition, we prove that norm-PRR converges linearly under the (global) Polyak–Łojasiewicz condition and in the interpolation setting. We further complement these non-asymptotic results and provide an in-depth analysis of the asymptotic properties of norm-PRR. Specifically, under the (local) Kurdyka–Łojasiewicz inequality, the whole sequence of iterates generated by norm-PRR is shown to converge to a single stationary point. Moreover, we derive last-iterate convergence rates that can match those in the smooth, strongly convex setting. Finally, numerical experiments are performed on nonconvex classification tasks to illustrate the efficiency of the proposed approach.
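The basic scheme the abstract describes — one incremental pass over the shuffled component functions per epoch, combined with a proximal step for the nonsmooth part — can be sketched as follows. This is a generic proximal random reshuffling template under illustrative assumptions (an $\ell_1$ regularizer handled by soft-thresholding, a constant step size, one proximal step per epoch), not the paper's exact norm-PRR update, which is built on normal maps; the function names and parameters are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_random_reshuffling(grad_i, x0, n, step, lam, epochs, rng):
    """Generic proximal random reshuffling sketch (illustrative only).

    Targets min_x (1/n) * sum_i f(x, i) + lam * ||x||_1, where
    grad_i(x, i) returns the gradient of the i-th smooth component
    f(., i). Note: the paper's norm-PRR method uses a normal map-based
    update; this plain prox-SGD-style loop only conveys the
    reshuffling structure.
    """
    x = x0.copy()
    for _ in range(epochs):
        perm = rng.permutation(n)   # sample a fresh permutation each epoch
        for i in perm:              # incremental pass over all components
            x = x - step * grad_i(x, i)
        # one proximal step per epoch for the nonsmooth regularizer
        x = soft_threshold(x, step * n * lam)
    return x
```

On a toy least-squares instance (each $f(x,i)=\tfrac12\|x-a_i\|^2$, `lam = 0`), the iterates settle near the mean of the $a_i$, up to a bias of order of the step size — the behavior whose dependence on $n$ and $T$ the paper quantifies.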
Problem

Research questions and friction points this paper is trying to address.

Develops norm-PRR for nonsmooth nonconvex finite-sum optimization
Improves the known iteration complexity for this problem class by a factor of $n^{-1/3}$
Proves linear convergence under Polyak-Łojasiewicz condition
Innovation

Methods, ideas, or system contributions that make the work stand out.

New normal map-based proximal random reshuffling method
Improved iteration complexity for nonsmooth nonconvex problems
Linear convergence under Polyak-Łojasiewicz condition
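For orientation, the Polyak–Łojasiewicz condition underlying the linear convergence result reads, in its standard smooth form (for an objective $\psi$ with minimum value $\psi^*$ and parameter $\mu > 0$):

```latex
\frac{1}{2}\,\|\nabla \psi(x)\|^2 \;\ge\; \mu\,\bigl(\psi(x) - \psi^*\bigr) \qquad \text{for all } x.
```

In the nonsmooth composite setting treated by the paper, the gradient is replaced by a suitable stationarity measure (e.g., based on the normal map); the smooth statement above is included for reference only.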
Xiao Li
School of Data Science, The Chinese University of Hong Kong, Shenzhen
Andre Milzarek
Assistant Professor, The Chinese University of Hong Kong, Shenzhen
nonsmooth optimization, stochastic optimization, second order methods, second order theory
Junwen Qiu
School of Data Science, The Chinese University of Hong Kong, Shenzhen