Implicit Bias as a Gauge Correction: Theory and Inverse Design

📅 2026-01-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work investigates the formation mechanism of implicit bias in machine learning optimization—specifically, how optimization algorithms prefer certain solutions among multiple feasible ones. By integrating the continuous symmetries of model parameterizations with the stochasticity inherent in optimization, the study provides the first unified explanation of implicit bias as a geometric correction induced by learning dynamics in the associated quotient space. Leveraging tools from differential geometry, stochastic differential equations, and Lie group theory, the authors develop a general framework that enables both forward prediction and inverse design of implicit biases. The approach accurately predicts and precisely controls diverse forms of implicit bias—such as sparsity and spectral properties—across various architectures, with numerical experiments showing excellent agreement with theoretical predictions.

📝 Abstract
A central problem in machine learning theory is to characterize how learning dynamics select particular solutions among the many compatible with the training objective, a phenomenon called implicit bias that remains only partially characterized. In the present work, we identify a general mechanism for the emergence of implicit biases, in terms of an explicit geometric correction to the learning dynamics, arising from the interaction between continuous symmetries in the model's parametrization and stochasticity in the optimization process. Our viewpoint is constructive in two complementary directions: given model symmetries, one can derive the implicit bias they induce; conversely, one can inverse-design a wide class of implicit biases by computing specific redundant parameterizations. More precisely, we show that, when the dynamics is expressed in the quotient space obtained by factoring out the symmetry group of the parameterization, the resulting stochastic differential equation gains a closed-form geometric correction in the stationary distribution of the optimizer dynamics, favoring orbits with small local volume. We compute the resulting symmetry-induced bias for a range of architectures, showing how several well-known results fit into a single unified framework. The approach also provides a practical methodology for deriving implicit biases in new settings, and it yields concrete, testable predictions that we confirm by numerical simulations on toy models trained on synthetic data, leaving more complex scenarios for future work. Finally, we test the implicit-bias inverse-design procedure in notable cases, including biases toward sparsity in linear features or in spectral properties of the model parameters.
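The sparsity case mentioned in the abstract can be illustrated with a standard toy experiment (a hedged sketch under simplifying assumptions, not the paper's own construction or code): reparameterizing a linear model's weights redundantly as w = u⊙u − v⊙v turns plain gradient descent's minimum-l2 bias into an l1-like sparsity bias, one of the well-known results the paper's framework aims to unify. All hyperparameters and problem sizes below are illustrative choices.

```python
# Hedged sketch (not the paper's code): implicit sparsity bias from a
# redundant "diagonal" parameterization w = u*u - v*v, a classic example
# of the parameterization-induced biases the abstract refers to.
import numpy as np

rng = np.random.default_rng(0)
n, d = 30, 60                          # underdetermined linear regression
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.5, 1.0]          # sparse ground-truth weights
y = X @ w_true

def grad(w):
    # Gradient of the mean squared loss (1/2n)||Xw - y||^2
    return X.T @ (X @ w - y) / n

# Direct parameterization: gradient descent from zero converges to the
# minimum-l2-norm interpolant, which is generically dense.
w_direct = np.zeros(d)
for _ in range(20000):
    w_direct -= 0.05 * grad(w_direct)

# Redundant parameterization w = u*u - v*v with small initialization:
# the same gradient descent acquires an l1-like bias and lands on a
# sparse interpolant instead.
u = np.full(d, 1e-3)
v = np.full(d, 1e-3)
for _ in range(20000):
    g = grad(u * u - v * v)
    u, v = u - 0.01 * 2 * u * g, v + 0.01 * 2 * v * g

w_sparse = u * u - v * v
print("entries above 0.1 (direct):   ", int(np.sum(np.abs(w_direct) > 0.1)))
print("entries above 0.1 (redundant):", int(np.sum(np.abs(w_sparse) > 0.1)))
```

In this sketch the direct solution spreads weight over many coordinates, while the reparameterized one concentrates on roughly the three true features. The paper's contribution, as described above, is to derive such biases systematically from the symmetries of the parameterization, rather than case by case.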
Problem

Research questions and friction points this paper is trying to address.

implicit bias
learning dynamics
model symmetries
stochastic optimization
solution selection
Innovation

Methods, ideas, or system contributions that make the work stand out.

implicit bias
gauge symmetry
geometric correction
inverse design
stochastic optimization
Nicola Aladrah
Università di Trieste, Dipartimento di Matematica, Informatica e Geoscienze, Trieste, Italy
Emanuele Ballarin
Università di Trieste, Dipartimento di Matematica, Informatica e Geoscienze, Trieste, Italy
Matteo Biagetti
Area Science Park, Trieste, Italy
Alessio Ansuini
AREA Science Park, Research and Technology Institute
Computational Neuroscience, Machine Learning, Artificial Intelligence, Neurobiology, Physics
Alberto d'Onofrio
Associate Professor, Computer Sciences for Complex Systems Lab, Università di Trieste
Theoretical Biophysics, Biomathematics, Nonlinear Physics, Mathematical Epidemiology
Fabio Anselmi
Università di Trieste, Dipartimento di Matematica, Informatica e Geoscienze, Trieste, Italy