An Embarrassingly Simple Way to Optimize Orthogonal Matrices at Scale

📅 2026-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of balancing computational efficiency and strict adherence to orthogonality constraints in large-scale optimization problems. The authors propose POGO, an algorithm built upon an enhanced Landing framework that integrates adaptive optimization principles. POGO enforces exact orthogonality throughout training using only five matrix multiplications per iteration, incurring negligible additional computational cost. By relying exclusively on GPU-friendly matrix operations, the method enables highly efficient parallelization while drastically reducing the number of hyperparameters. Experimental results demonstrate that POGO substantially outperforms existing approaches across multiple benchmarks, completing optimization tasks involving thousands of orthogonal matrices in minutes—a process that requires several hours with competing methods.

📝 Abstract
Orthogonality constraints are ubiquitous in robust and probabilistic machine learning. Unfortunately, current optimizers are computationally expensive and do not scale to problems with hundreds or thousands of constraints. One notable exception is the Landing algorithm (Ablin et al., 2024), which, however, comes at the expense of temporarily relaxing orthogonality. In this work, we revisit and improve on the ideas behind Landing, enabling the inclusion of modern adaptive optimizers while ensuring that orthogonality constraints are effectively met. Remarkably, these improvements come at little to no cost and reduce the number of required hyperparameters. Our algorithm POGO is fast and GPU-friendly, consisting of only 5 matrix products, and in practice maintains orthogonality at all times. On several challenging benchmarks, POGO greatly outperforms recent optimizers and can solve problems with thousands of orthogonal matrices in minutes, where alternatives would take hours. As such, POGO sets a milestone toward finally exploiting orthogonality constraints in ML at scale. A PyTorch implementation of POGO is publicly available at https://github.com/adrianjav/pogo.
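For intuition, the Landing mechanism that POGO builds on can be sketched with nothing but matrix products. The sketch below is an illustrative reconstruction of the baseline Landing update (Ablin et al.), not the POGO algorithm itself; the function name, step size, and landing weight are hypothetical choices, and the exact product count differs from POGO's five.

```python
import numpy as np

def landing_step(X, grad, lr=0.1, lam=1.0):
    """One Landing-style update (illustrative sketch, not the paper's POGO).

    Combines a relative (skew-symmetric) gradient direction with a term that
    pulls X back toward the orthogonal manifold (X^T X = I), using only
    matrix products -- no QR/SVD retraction. `lr` and `lam` are made-up
    illustrative values, not the paper's settings.
    """
    n = X.shape[1]
    G = grad @ X.T
    psi = 0.5 * (G - G.T) @ X              # descent direction tangent to the manifold
    N = X @ (X.T @ X - np.eye(n))          # "landing" pull toward orthogonality
    return X - lr * (psi + lam * N)
```

With a zero gradient, the landing term alone shrinks the orthogonality error `||X^T X - I||` at every step, which is the property the abstract's "effectively met" constraint claim strengthens.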
Problem

Research questions and friction points this paper is trying to address.

orthogonality constraints
scalable optimization
machine learning
orthogonal matrices
large-scale optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

orthogonal optimization
scalable algorithm
adaptive optimizer
constraint preservation
GPU-efficient