Remove Symmetries to Control Model Expressivity

📅 2024-08-28
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Symmetry in the loss function commonly causes model collapse, that is, degeneracy into low-capacity solutions, during deep learning training. Method: the authors propose syre, a generic, model-agnostic desymmetrization framework that removes almost all symmetry-induced low-capacity states via parameter-space reparameterization, without requiring prior knowledge of the symmetry structure. Contribution/Results: the paper proves two concrete mechanisms through which symmetries reduce capacity and cause features to be ignored, pointing to a trade-off between symmetry and model expressivity. Empirically, syre is reported to improve optimization stability and final performance across diverse architectures, including CNNs, ViTs, and contrastive learners, on classification and representation learning benchmarks. In settings where entrapment is a concern, removing symmetries correlates well with improved optimization or performance.

📝 Abstract
When symmetry is present in the loss function, the model is likely to be trapped in a low-capacity state that is sometimes known as a "collapse". Being trapped in these low-capacity states can be a major obstacle to training across many scenarios where deep learning technology is applied. We first prove two concrete mechanisms through which symmetries lead to reduced capacities and ignored features during training and inference. We then propose a simple and theoretically justified algorithm, syre, to remove almost all symmetry-induced low-capacity states in neural networks. In settings where this type of entrapment is a particular concern, removing symmetries with the proposed method is shown to correlate well with improved optimization or performance. A notable merit of the proposed method is that it is model-agnostic and does not require any knowledge of the symmetry.

Problem

Research questions and friction points this paper is trying to address.

Symmetry in the loss function traps models in low-capacity ("collapsed") states
Whether removing symmetries improves optimization or performance
How to remove symmetries without knowing their structure

Innovation

Methods, ideas, or system contributions that make the work stand out.

Remove symmetries in loss function
Propose syre algorithm
Model-agnostic symmetry removal
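
A toy illustration of the idea in the abstract, written as a minimal sketch rather than the paper's algorithm. The abstract and summary only say that syre removes symmetry-induced low-capacity states via a reparameterization of parameter space. The two-parameter loss, the fixed shift `theta0`, and the helper names below are all assumptions chosen for illustration. The loss `(u*w - 1)^2` is symmetric under `(u, w) -> (-u, -w)`, so the symmetric point `(0, 0)` has zero gradient and gradient descent started there never leaves. Evaluating the same loss at `theta + theta0` moves that stationary point away from `theta = 0`.

```python
import numpy as np

def loss(theta):
    """Toy loss with the reflection symmetry (u, w) -> (-u, -w)."""
    u, w = theta
    return (u * w - 1.0) ** 2

def grad(theta):
    """Analytic gradient of the toy loss."""
    u, w = theta
    r = u * w - 1.0
    return np.array([2.0 * r * w, 2.0 * r * u])

def gradient_descent(grad_fn, theta_init, lr=0.05, steps=500):
    """Plain gradient descent with a fixed learning rate."""
    theta = np.array(theta_init, dtype=float)
    for _ in range(steps):
        theta = theta - lr * grad_fn(theta)
    return theta

# Hypothetical fixed shift (an assumption, not taken from the source).
theta0 = np.array([0.3, -0.2])

def shifted_grad(theta):
    """Gradient of the reparameterized loss theta -> loss(theta + theta0)."""
    return grad(theta + theta0)

# Symmetric loss: starting at the symmetric point, the iterate stays trapped.
collapsed = gradient_descent(grad, [0.0, 0.0])

# Shifted loss: the same start is no longer a stationary point.
escaped = gradient_descent(shifted_grad, [0.0, 0.0])

print("symmetric loss, final value:", loss(collapsed))
print("shifted loss, final value:  ", loss(escaped + theta0))
```

In this sketch the shift is fixed before training and never updated, so the trained parameters are still `theta`; only the point at which the loss is evaluated changes. The shift requires no knowledge of which symmetry the loss has, which is the property the abstract highlights.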
Ziyin Liu
Research Laboratory of Electronics, Massachusetts Institute of Technology; Physics & Informatics Laboratories, NTT Research
Yizhou Xu
PhD student, Computer and Communication Sciences, EPFL
Machine learning, High dimensional statistics, Statistical physics
Isaac Chuang
Research Laboratory of Electronics, Massachusetts Institute of Technology; Department of Physics, Massachusetts Institute of Technology