🤖 AI Summary
This work addresses stochastic optimization of nonsmooth *tame* functions on Riemannian manifolds, motivated by training challenges in deep learning that arise from geometric constraints and nondifferentiable components. We propose a reparameterized stochastic gradient descent (SGD) algorithm that incorporates a contraction mapping to guarantee that all iterates remain on the manifold. For the first time, we unify tame geometry with Riemannian stochastic optimization, rigorously linking the subdifferential properties of tame functions on manifolds to the convergence behavior of SGD. Under mild regularity assumptions, requiring only weak continuity of the generalized gradient, we establish almost-sure global convergence of the algorithm for general nonsmooth objectives under diminishing step sizes. This provides the first theoretically rigorous and practically implementable convergence framework for modern machine learning models involving both geometric constraints and nonsmooth structure.
📝 Abstract
In many learning applications, the parameters of a model are structurally constrained in a way that can be modeled as lying on a Riemannian manifold. Riemannian optimization, in which the iterates of a minimizing sequence are explicitly kept on the manifold, is used to train such models. At the same time, tame geometry has emerged as a significant topological framework for describing the nonsmooth functions that appear in the training landscapes of neural networks and other important models built from compositions of continuous nonlinear functions with nonsmooth maps. In this paper, we study the properties of such stratifiable functions on a manifold and the behavior of retracted stochastic gradient descent, with diminishing stepsizes, for minimizing such functions.
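To make the setting concrete, below is a minimal sketch of retracted stochastic (sub)gradient descent with diminishing step sizes. It is not the algorithm analyzed in the paper: the unit-sphere manifold, the normalization retraction, the toy nonsmooth ℓ1 objective, and all names (`sphere_retraction`, `retracted_sgd`, `grad_sample`) are illustrative assumptions chosen only to show the iterate-stays-on-the-manifold pattern.

```python
import numpy as np

def sphere_retraction(x, v):
    """Map a tangent-space step back onto the unit sphere by normalization
    (one common retraction; the paper's specific map may differ)."""
    y = x + v
    return y / np.linalg.norm(y)

def project_to_tangent(x, g):
    """Project an ambient (sub)gradient onto the tangent space of the sphere at x."""
    return g - np.dot(g, x) * x

def retracted_sgd(grad_sample, x0, n_iters=10_000, a=1.0, seed=0):
    """Riemannian SGD with diminishing step sizes alpha_k = a / (k + 1).

    grad_sample(x, rng) returns a stochastic (sub)gradient of the objective
    at x in the ambient space.
    """
    rng = np.random.default_rng(seed)
    x = x0 / np.linalg.norm(x0)
    for k in range(n_iters):
        g = grad_sample(x, rng)
        v = -(a / (k + 1)) * project_to_tangent(x, g)
        x = sphere_retraction(x, v)          # every iterate stays on the manifold
    return x

# Toy nonsmooth (tame) objective: f(x) = ||A x - b||_1, with noisy subgradients.
d = 20
rng0 = np.random.default_rng(1)
A, b = rng0.standard_normal((50, d)), rng0.standard_normal(50)

def grad_sample(x, rng):
    i = rng.integers(len(b))                 # sample one row: a stochastic subgradient
    return np.sign(A[i] @ x - b[i]) * A[i] * len(b)

x_star = retracted_sgd(grad_sample, rng0.standard_normal(d))
print("final objective:", np.abs(A @ x_star - b).sum())
```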