A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization

📅 2023-12-02
🏛️ arXiv.org
📈 Citations: 3
Influential: 1
🤖 AI Summary
Random reshuffling methods are well understood in the smooth setting, but comparable theoretical foundations for nonsmooth nonconvex finite-sum optimization are largely missing. Method: We propose the normal map-based proximal random reshuffling (norm-PRR) algorithm, which combines normal maps and proximal operators within a random reshuffling framework. Contribution/Results: norm-PRR achieves an $O(n^{-1/3}T^{-2/3})$ iteration complexity bound for nonsmooth nonconvex finite-sum problems, improving the currently known bounds for this class by a factor of $n^{-1/3}$. We prove linear convergence under the (global) Polyak–Łojasiewicz (PL) condition, convergence of the whole iterate sequence to a single stationary point under the (local) Kurdyka–Łojasiewicz (KL) inequality, and last-iterate convergence rates that can match those of the smooth, strongly convex setting, bridging non-asymptotic and asymptotic analyses. Numerical experiments on nonsmooth nonconvex classification tasks illustrate norm-PRR's efficiency and robustness.
📝 Abstract
Random reshuffling techniques are prevalent in large-scale applications, such as training neural networks. While the convergence and acceleration effects of random reshuffling-type methods are fairly well understood in the smooth setting, far fewer results are available in the nonsmooth case. In this work, we design a new normal map-based proximal random reshuffling (norm-PRR) method for nonsmooth nonconvex finite-sum problems. We show that norm-PRR achieves the iteration complexity $O(n^{-1/3}T^{-2/3})$, where $n$ denotes the number of component functions $f(\cdot,i)$ and $T$ counts the total number of iterations. This improves the currently known complexity bounds for this class of problems by a factor of $n^{-1/3}$. In addition, we prove that norm-PRR converges linearly under the (global) Polyak–Łojasiewicz condition and in the interpolation setting. We further complement these non-asymptotic results and provide an in-depth analysis of the asymptotic properties of norm-PRR. Specifically, under the (local) Kurdyka–Łojasiewicz inequality, the whole sequence of iterates generated by norm-PRR is shown to converge to a single stationary point. Moreover, we derive last-iterate convergence rates that can match those in the smooth, strongly convex setting. Finally, numerical experiments are performed on nonconvex classification tasks to illustrate the efficiency of the proposed approach.
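The basic scheme the abstract describes — one incremental pass over the shuffled component functions per epoch, combined with a proximal step for the nonsmooth part — can be sketched as follows. This is a generic proximal random reshuffling template under illustrative assumptions (an $\ell_1$ regularizer handled by soft-thresholding, a constant step size, one proximal step per epoch), not the paper's exact norm-PRR update, which is built on normal maps; the function names and parameters are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_random_reshuffling(grad_i, x0, n, step, lam, epochs, rng):
    """Generic proximal random reshuffling sketch (illustrative only).

    Targets min_x (1/n) * sum_i f(x, i) + lam * ||x||_1, where
    grad_i(x, i) returns the gradient of the i-th smooth component
    f(., i). Note: the paper's norm-PRR method uses a normal map-based
    update; this plain prox-SGD-style loop only conveys the
    reshuffling structure.
    """
    x = x0.copy()
    for _ in range(epochs):
        perm = rng.permutation(n)   # sample a fresh permutation each epoch
        for i in perm:              # incremental pass over all components
            x = x - step * grad_i(x, i)
        # one proximal step per epoch for the nonsmooth regularizer
        x = soft_threshold(x, step * n * lam)
    return x
```

On a toy least-squares instance (each $f(x,i)=\tfrac12\|x-a_i\|^2$, `lam = 0`), the iterates settle near the mean of the $a_i$, up to a bias of order of the step size — the behavior whose dependence on $n$ and $T$ the paper quantifies.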
Problem

Research questions and friction points this paper is trying to address.

Develops norm-PRR for nonsmooth nonconvex finite-sum optimization
Improves the known iteration complexity for this problem class by a factor of $n^{-1/3}$
Proves linear convergence under Polyak-Łojasiewicz condition
Innovation

Methods, ideas, or system contributions that make the work stand out.

New normal map-based proximal random reshuffling method
Improved iteration complexity for nonsmooth nonconvex problems
Linear convergence under Polyak-Łojasiewicz condition
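For orientation, the Polyak–Łojasiewicz condition underlying the linear convergence result reads, in its standard smooth form (for an objective $\psi$ with minimum value $\psi^*$ and parameter $\mu > 0$):

```latex
\frac{1}{2}\,\|\nabla \psi(x)\|^2 \;\ge\; \mu\,\bigl(\psi(x) - \psi^*\bigr) \qquad \text{for all } x.
```

In the nonsmooth composite setting treated by the paper, the gradient is replaced by a suitable stationarity measure (e.g., based on the normal map); the smooth statement above is included for reference only.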
Xiao Li
School of Data Science, The Chinese University of Hong Kong, Shenzhen
Andre Milzarek
Assistant Professor, The Chinese University of Hong Kong, Shenzhen
nonsmooth optimization, stochastic optimization, second order methods, second order theory
Junwen Qiu
School of Data Science, The Chinese University of Hong Kong, Shenzhen