A Normal Map-Based Proximal Stochastic Gradient Method: Convergence and Identification Properties

📅 2023-05-10
📈 Citations: 5
Influential: 1
🤖 AI Summary
Proximal stochastic gradient descent (PSGD) struggles to identify underlying low-dimensional structures, such as support sets or low-rank manifolds, in stochastic composite optimization and lacks finite-time manifold identification guarantees. Method: The paper proposes the normal map-based proximal stochastic gradient method (NSGD), a simple variant of PSGD built upon Robinson's normal map and designed for general nonconvex stochastic settings. Contributions/Results: NSGD achieves finite-time active manifold identification and almost-sure convergence to stationary points without convexity assumptions or variance-reduction techniques. By combining Kurdyka-Łojasiewicz inequality analysis with almost-sure iterate convergence guarantees, NSGD attains global convergence to stationary points with iteration complexity matching that of PSGD. Crucially, it identifies the active manifold exactly in finitely many steps with probability one, overcoming a structural identification limitation of conventional PSGD.
📝 Abstract
The proximal stochastic gradient method (PSGD) is one of the state-of-the-art approaches for stochastic composite-type problems. In contrast to its deterministic counterpart, PSGD has been found to have difficulties with the correct identification of underlying substructures (such as supports, low-rank patterns, or active constraints) and it does not possess a finite-time manifold identification property. Existing solutions rely on convexity assumptions or on the additional usage of variance reduction techniques. In this paper, we address these limitations and present a simple variant of PSGD based on Robinson's normal map. The proposed normal map-based proximal stochastic gradient method (NSGD) is shown to converge globally, i.e., accumulation points of the generated iterates correspond to stationary points almost surely. In addition, we establish complexity bounds for NSGD that match the known results for PSGD, and we prove that NSGD can almost surely identify active manifolds in finite time in a general nonconvex setting. Our derivations build on almost-sure iterate convergence guarantees and utilize analysis techniques based on the Kurdyka-Łojasiewicz inequality.
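To make the normal-map idea concrete, here is a minimal, hypothetical sketch of an NSGD-style iteration on a toy L1-regularized least-squares problem. All names, parameter values, and the problem instance are illustrative assumptions, not taken from the paper; the point is that the iterate lives in a dual-like z-space, the point x^k = prox(z^k) is exactly sparse, and the update follows the normal map, which is what enables finite-time support (active manifold) identification.

```python
import numpy as np

# Hedged sketch of a normal map-based proximal stochastic gradient step
# on min_x f(x) + phi(x), with f(x) = (1/2n)||Ax - b||^2 and
# phi(x) = mu * ||x||_1. Problem data and hyperparameters are illustrative.

rng = np.random.default_rng(0)
n, d = 200, 10
x_true = np.zeros(d)
x_true[:3] = [1.0, -2.0, 0.5]          # sparse ground truth, support {0, 1, 2}
A = rng.standard_normal((n, d))
b = A @ x_true + 0.01 * rng.standard_normal(n)
mu = 0.1                                # L1 weight

def prox_l1(z, t):
    """Soft-thresholding: proximal operator of t * mu * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t * mu, 0.0)

lam = 0.5       # prox parameter lambda in the normal map
alpha = 0.05    # step size
batch = 16      # minibatch size

z = np.zeros(d)                         # iterate lives in the z-space
for k in range(3000):
    x = prox_l1(z, lam)                 # x^k = prox_{lam*phi}(z^k), exactly sparse
    idx = rng.integers(0, n, batch)     # stochastic gradient of f at x^k
    g = A[idx].T @ (A[idx] @ x - b[idx]) / batch
    # Normal-map step: z^{k+1} = z^k - alpha * (g^k + (z^k - x^k) / lam)
    z = z - alpha * (g + (z - x) / lam)

x = prox_l1(z, lam)
support = np.flatnonzero(np.abs(x) > 0) # sparsity pattern of the final iterate
print("recovered support:", support)
```

Because the prox output is exactly sparse (unlike a plain SGD iterate, which is dense with probability one), the sketch recovers the true support after finitely many iterations, mirroring the identification behavior the paper establishes.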
Problem

Research questions and friction points this paper is trying to address.

PSGD struggles with identifying substructures in stochastic problems
Existing solutions require convexity or variance reduction techniques
Propose NSGD for global convergence and finite-time manifold identification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Normal map-based proximal stochastic gradient method
Global convergence to stationary points
Finite-time active manifold identification
Junwen Qiu
School of Data Science (SDS), The Chinese University of Hong Kong, Shenzhen, Shenzhen Research Institute of Big Data (SRIBD), Shenzhen, Guangdong, China
Li Jiang
Andre Milzarek
Assistant Professor, The Chinese University of Hong Kong, Shenzhen
nonsmooth optimization, stochastic optimization, second order methods, second order theory