🤖 AI Summary
To address the prohibitively high computational cost of Riemannian retractions in large-scale optimization under orthogonality constraints, this paper proposes Random Submanifold Optimization (RSMO). At each iteration, RSMO restricts the tangent-space search to a low-dimensional random submanifold and applies a simplified retraction there, drastically reducing per-iteration complexity. The method supports two efficient sampling strategies and establishes convergence guarantees in general non-convex, Riemannian Polyak–Łojasiewicz (PL), and stochastic settings. Moreover, it extends naturally to quotient manifolds induced by the orthogonal group. Experiments on large-scale tasks—including matrix completion, principal component analysis (PCA), and orthogonalization of Transformer weights—demonstrate that RSMO achieves a 2–5× speedup over state-of-the-art methods while preserving solution accuracy, significantly enhancing the scalability of orthogonally constrained optimization.
📝 Abstract
Optimization with orthogonality constraints frequently arises in many fields, including machine learning. Riemannian optimization offers a powerful framework for solving these problems by equipping the constraint set with a Riemannian manifold structure and performing optimization intrinsically on the manifold. This approach typically involves computing a search direction in the tangent space and updating variables via a retraction operation. However, as the size of the variables increases, the computational cost of the retraction can become prohibitively high, limiting the applicability of Riemannian optimization to large-scale problems. To address this challenge and enhance scalability, we propose a novel approach that restricts each update to a random submanifold, thereby significantly reducing the per-iteration complexity. We introduce two sampling strategies for selecting the random submanifolds and theoretically analyze the convergence of the proposed methods. We provide convergence results for general nonconvex functions and for functions satisfying the Riemannian Polyak–Łojasiewicz (PL) condition, as well as for stochastic optimization settings. Additionally, we demonstrate how our approach can be generalized to quotient manifolds derived from the orthogonal manifold. Extensive experiments verify the benefits of the proposed method across a wide variety of problems.
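To make the tangent-space-then-retraction pattern concrete, below is a minimal NumPy sketch on the Stiefel manifold (matrices with orthonormal columns): a standard QR-based retraction, the tangent-space projection of a Euclidean gradient, and a toy update that retracts only a randomly sampled block of columns. This is an illustration of the general "restrict the update to a random submanifold" idea under simplifying assumptions, not the paper's RSMO algorithm; the step size, sampling scheme, and re-orthogonalization step here are all hypothetical choices for the example.

```python
import numpy as np

def qr_retraction(X, xi):
    # Map the ambient-space step X + xi back onto the Stiefel manifold
    # via a reduced QR decomposition; fix column signs for determinism.
    Q, R = np.linalg.qr(X + xi)
    return Q * np.sign(np.sign(np.diag(R)) + 0.5)

def project_tangent(X, G):
    # Project a Euclidean gradient G onto the tangent space at X.
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2

rng = np.random.default_rng(0)
n, p, r = 1000, 50, 5            # ambient sizes; r << p sampled columns
X = np.linalg.qr(rng.standard_normal((n, p)))[0]
G = rng.standard_normal((n, p))  # stand-in for a Euclidean gradient

# Full update: the retraction cost grows with the full width p.
X_full = qr_retraction(X, -0.1 * project_tangent(X, G))

# Submanifold-style update (toy): retract only r sampled columns,
# then re-orthogonalize them against the untouched columns.
idx = rng.choice(p, size=r, replace=False)
sub = qr_retraction(X[:, idx], -0.1 * project_tangent(X, G)[:, idx])
fixed = np.delete(X, idx, axis=1)
sub -= fixed @ (fixed.T @ sub)   # remove components along fixed columns
X_sub = X.copy()
X_sub[:, idx] = np.linalg.qr(sub)[0]

# Both updates keep the columns orthonormal.
print(np.allclose(X_full.T @ X_full, np.eye(p), atol=1e-8))
print(np.allclose(X_sub.T @ X_sub, np.eye(p), atol=1e-8))
```

The point of the sketch is the cost asymmetry: the full update factors an n×p matrix each iteration, while the block update factors only an n×r matrix, which is the source of the per-iteration savings the paper targets.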