🤖 AI Summary
This work investigates the convergence and implicit bias of matrix-form stochastic mirror descent (SMD) in over-parameterized, high-dimensional multi-output problems such as multiclass classification and matrix completion. By extending the implicit-bias theory of vector-valued SMD to the matrix setting, it shows for the first time how the Bregman divergence induced by the mirror map determines which interpolating solution is selected and, with it, the model's inductive bias. The theoretical analysis demonstrates that, under over-parameterization, matrix SMD converges exponentially fast to the unique solution that both interpolates the training data and minimizes the Bregman divergence from the initialization. This result highlights the pivotal role of the mirror map in shaping the generalization behavior of the learned model.
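In standard mirror-descent notation (a schematic restatement; the symbols $W_t$, $\eta$, $L_{i_t}$, and $W_0$ are generic and not necessarily the paper's), the matrix SMD update and the implicit-bias characterization read

$$
\nabla\psi(W_{t+1}) = \nabla\psi(W_t) - \eta\,\nabla_W L_{i_t}(W_t),
\qquad
W_\infty = \operatorname*{arg\,min}_{W \,:\, W \text{ interpolates the data}} D_\psi(W, W_0),
$$

where $D_\psi(W, W') = \psi(W) - \psi(W') - \langle \nabla\psi(W'),\, W - W' \rangle$ is the Bregman divergence induced by the mirror map $\psi(\cdot)$. The update steps in the dual space defined by $\nabla\psi$, which is what ties the choice of mirror map to the interpolator the algorithm ultimately selects.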
📝 Abstract
We investigate Stochastic Mirror Descent (SMD) with matrix parameters and vector-valued predictions, a framework relevant to multiclass classification and matrix completion. Focusing on the overparameterized regime, where the total number of parameters exceeds the number of training samples, we prove that SMD with a matrix mirror function $\psi(\cdot)$ converges exponentially fast to a global interpolator. Furthermore, we generalize classical implicit-bias results for vector SMD by showing that matrix SMD converges to the unique solution that minimizes the Bregman divergence induced by $\psi(\cdot)$ from the initialization, subject to interpolating the data. These findings reveal how matrix mirror maps dictate the inductive bias in high-dimensional, multi-output problems.
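As a concrete illustration (a minimal sketch, not the paper's experiments: the entrywise $p$-norm potential $\psi(W) = \tfrac{1}{p}\sum_{ij}|W_{ij}|^p$, the linear model, the step size, and the problem sizes below are all assumptions), matrix SMD can be run on an overparameterized least-squares problem with vector-valued predictions; with $p = 2$ the update reduces to plain SGD:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: d inputs, k outputs, n samples; overparameterized since d*k > n.
d, k, n = 20, 3, 15
X = rng.standard_normal((n, d))
W_star = rng.standard_normal((d, k)) / np.sqrt(d)
Y = X @ W_star                       # noiseless labels, so interpolators exist

p = 3.0                              # potential exponent; p = 2 recovers plain SGD

def grad_psi(W):
    # Mirror map: gradient of psi(W) = (1/p) * sum_ij |W_ij|^p, applied entrywise.
    return np.sign(W) * np.abs(W) ** (p - 1)

def grad_psi_inv(Z):
    # Inverse mirror map: solves grad_psi(W) = Z entrywise.
    return np.sign(Z) * np.abs(Z) ** (1.0 / (p - 1))

W = np.zeros((d, k))                 # initialization W_0
eta = 1e-2                           # step size (assumed small enough for stability)
for t in range(200_000):
    i = rng.integers(n)              # draw one training sample
    x, y = X[i], Y[i]
    residual = x @ W - y             # vector-valued prediction error, shape (k,)
    grad_L = np.outer(x, residual)   # gradient of 0.5 * ||x @ W - y||^2 w.r.t. W
    W = grad_psi_inv(grad_psi(W) - eta * grad_L)  # SMD step taken in the dual space

print("training error:", np.linalg.norm(X @ W - Y))  # should shrink toward zero
```

In this toy setup the training error should shrink toward interpolation, and varying $p$ changes which interpolator is reached, mirroring the Bregman-divergence characterization of the implicit bias stated above.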