🤖 AI Summary
This work addresses the challenges posed by vectorization in high-dimensional tensor generalized linear models, which destroys the inherent multilinear structure and leads to ill-posed estimation. To overcome these issues, the authors propose a low separation rank (LSR) tensor generalized linear model (LSR-TGLM) that preserves the intrinsic structure of tensor coefficients through LSR decomposition. An efficient block coordinate descent algorithm is developed, featuring a novel Muon momentum update mechanism based on Newton-Schulz orthogonalization to replace conventional QR projection, significantly accelerating convergence. Empirical evaluations on synthetic linear, logistic, and Poisson regression tasks, as well as 3D Vessel MNIST classification, demonstrate that the proposed method achieves lower estimation and prediction errors with fewer iterations and reduced runtime, while maintaining superior classification performance.
📄 Abstract
Tensor-valued data arise naturally in multidimensional signal and imaging problems, such as biomedical imaging. When incorporated into generalized linear models (GLMs), naive vectorization can destroy their multi-way structure and lead to high-dimensional, ill-posed estimation. To address this challenge, Low Separation Rank (LSR) decompositions reduce model complexity by imposing low-rank multilinear structure on the coefficient tensor. A representative approach for estimating LSR-based tensor GLMs (LSR-TGLMs) is the Low Separation Rank Tensor Regression (LSRTR) algorithm, which adopts block coordinate descent and enforces orthogonality of the factor matrices through repeated QR-based projections. However, the repeated projection steps can be computationally demanding and can slow convergence. Motivated by the need for scalable estimation and classification from such data, we propose LSRTR-M, which incorporates Muon (MomentUm Orthogonalized by Newton-Schulz) updates into the LSRTR framework. Specifically, LSRTR-M preserves the original block coordinate scheme while replacing the projection-based factor updates with Muon steps. Across synthetic linear, logistic, and Poisson LSR-TGLMs, LSRTR-M converges faster in both iteration count and wall-clock time, while achieving lower normalized estimation and prediction errors. On the Vessel MNIST 3D task, it further improves computational efficiency while maintaining competitive classification performance.
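To make the core idea concrete, the sketch below illustrates the kind of Newton-Schulz iteration that underlies Muon-style orthogonalized updates: instead of projecting a factor matrix onto the orthogonal manifold with an explicit QR decomposition, an iterative polynomial scheme drives its columns toward orthonormality using only matrix multiplications. This is a minimal illustration, not the paper's implementation; the function name, step count, and the use of the classical cubic iteration (Muon in practice uses tuned quintic coefficients) are all assumptions.

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=15):
    """Drive the columns of a tall matrix G toward orthonormality
    (approximating its orthogonal polar factor) using only matrix
    products, as an alternative to an explicit QR projection.

    Hypothetical sketch: classical cubic Newton-Schulz iteration,
    not the tuned coefficients used by the actual Muon optimizer.
    """
    # Scale so all singular values lie in (0, 1], which guarantees
    # convergence of the cubic iteration x -> 1.5*x - 0.5*x^3 to 1.
    X = G / np.linalg.norm(G)
    for _ in range(steps):
        A = X.T @ X
        X = 1.5 * X - 0.5 * X @ A  # one Newton-Schulz step
    return X

# Toy usage: orthogonalize a random 8x3 "factor matrix".
rng = np.random.default_rng(0)
G = rng.standard_normal((8, 3))
Q = newton_schulz_orthogonalize(G)
print(np.allclose(Q.T @ Q, np.eye(3), atol=1e-6))
```

Because each step is a handful of matrix multiplications, this kind of update is GPU-friendly and avoids the sequential Householder/Givens structure of QR, which is the efficiency argument the abstract alludes to.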