Efficient Optimization with Orthogonality Constraint: a Randomized Riemannian Submanifold Method

📅 2025-05-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the prohibitively high computational cost of Riemannian retractions in large-scale optimization under orthogonality constraints, this paper proposes Random Submanifold Optimization (RSMO). At each iteration, RSMO performs the tangent-space search and a simplified retraction exclusively on a low-dimensional random submanifold, drastically reducing per-iteration complexity. The method supports two efficient sampling strategies and, for the first time, establishes convergence guarantees for general non-convex objectives, for functions satisfying the Riemannian Polyak–Łojasiewicz (PL) condition, and in stochastic settings. It also extends naturally to quotient manifolds induced by the orthogonal group. Experiments on large-scale tasks, including matrix completion, principal component analysis (PCA), and orthogonalization of Transformer weights, show that RSMO achieves a 2–5× speedup over state-of-the-art methods while preserving solution accuracy, significantly improving the scalability of orthogonally constrained optimization.
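The update scheme summarized above can be illustrated with a toy sketch: restrict each step to a random set of `r` coordinates, take a gradient step on the small orthogonal group O(r), and apply a cheap O(r³) Cayley retraction there instead of a full retraction on the large manifold. This is an illustrative approximation, not the paper's exact algorithm; the sizes `n`, `p`, `r`, the PCA-type objective, and the Cayley step are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, r = 200, 5, 20        # ambient dim, orthonormal columns, submanifold dim (hypothetical)
A = rng.standard_normal((n, n))
A = A + A.T                 # symmetric, so f below is a PCA-type objective

def f(X):
    return -0.5 * np.trace(X.T @ A @ X)

# Random starting point on the Stiefel manifold St(n, p).
X, _ = np.linalg.qr(rng.standard_normal((n, p)))
f0 = f(X)

eta = 0.01
for _ in range(300):
    S = rng.choice(n, size=r, replace=False)   # random coordinate submanifold
    G = -(A @ X)[S, :]                         # Euclidean gradient restricted to rows S
    GQ = G @ X[S, :].T                         # gradient w.r.t. the r x r factor Q at Q = I
    W = GQ - GQ.T                              # skew-symmetric search direction on O(r)
    I_r = np.eye(r)
    # Cayley retraction exp(-eta * W) on O(r): an O(r^3) solve, independent of n.
    Q = np.linalg.solve(I_r + (eta / 2) * W, I_r - (eta / 2) * W)
    X[S, :] = Q @ X[S, :]                      # update touches only r of the n rows of X
```

Because the left factor embeds an orthogonal r×r block into the identity, `X.T @ X = I` is preserved exactly at every step, and the per-iteration retraction cost depends on `r` rather than on the ambient size `n`.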

📝 Abstract
Optimization with orthogonality constraints frequently arises in various fields such as machine learning. Riemannian optimization offers a powerful framework for solving these problems by equipping the constraint set with a Riemannian manifold structure and performing optimization intrinsically on the manifold. This approach typically involves computing a search direction in the tangent space and updating variables via a retraction operation. However, as the size of the variables increases, the computational cost of the retraction can become prohibitively high, limiting the applicability of Riemannian optimization to large-scale problems. To address this challenge and enhance scalability, we propose a novel approach that restricts each update to a random submanifold, thereby significantly reducing the per-iteration complexity. We introduce two sampling strategies for selecting the random submanifolds and theoretically analyze the convergence of the proposed methods. We provide convergence results for general nonconvex functions and functions that satisfy the Riemannian Polyak–Łojasiewicz condition, as well as for stochastic optimization settings. Additionally, we demonstrate how our approach can be generalized to quotient manifolds derived from the orthogonal manifold. Extensive experiments verify the benefits of the proposed method across a wide variety of problems.
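For contrast with the randomized update, the standard Riemannian step the abstract describes, a tangent-space search followed by a retraction, can be sketched as follows. This is a generic textbook-style step, not code from the paper; the problem sizes, the PCA-type objective, and the QR retraction are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 10                      # hypothetical problem sizes
A = rng.standard_normal((n, n))
A = A + A.T                         # symmetric matrix for a PCA-type objective

def f(X):
    return -0.5 * np.trace(X.T @ A @ X)

# Random starting point on the Stiefel manifold St(n, p).
X, _ = np.linalg.qr(rng.standard_normal((n, p)))
f0 = f(X)

eta = 1e-3
for _ in range(100):
    G = -A @ X                                      # Euclidean gradient
    # Project onto the tangent space of St(n, p) at X.
    xi = G - X @ (X.T @ G + G.T @ X) / 2
    # QR-based retraction: O(n p^2) per iteration, the bottleneck at large n.
    Q, R = np.linalg.qr(X - eta * xi)
    X = Q * np.sign(np.diag(R))                     # sign fix for a well-defined retraction
```

The O(n p²) QR factorization in every iteration is exactly the cost the paper's randomized submanifold updates are designed to avoid when `n` is large.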
Problem

Research questions and friction points this paper is trying to address.

Reducing computational cost in large-scale Riemannian optimization
Enhancing scalability via randomized submanifold updates
Generalizing approach to quotient manifolds for orthogonality constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Randomized Riemannian submanifold method for efficiency
Sampling strategies to reduce per-iteration complexity
Generalization to quotient manifolds from orthogonal manifold