Identifiable Equivariant Networks are Layerwise Equivariant

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates the relationship between end-to-end equivariance and layerwise equivariance in deep neural networks, offering a theoretical explanation for the empirically observed phenomenon in which weights spontaneously develop equivariant structure during training. Under a parameter identifiability assumption, the work gives a rigorous proof that if a network is equivariant as a whole under group actions on its input and output spaces, then there exists a parameter choice, realizing the same end-to-end function, under which every layer is equivariant with respect to group actions on the latent spaces. Using tools from group actions, parameter identifiability, and abstract algebra, the authors build an architecture-agnostic theoretical framework that applies to a broad class of identifiable networks. This framework supplies a mathematical foundation for the equivariant structures observed in trained weights and clarifies the mechanism by which equivariance arises during training.

📝 Abstract
We investigate the relation between end-to-end equivariance and layerwise equivariance in deep neural networks. We prove the following: For a network whose end-to-end function is equivariant with respect to group actions on the input and output spaces, there is a parameter choice yielding the same end-to-end function such that its layers are equivariant with respect to some group actions on the latent spaces. Our result assumes that the parameters of the model are identifiable in an appropriate sense. This identifiability property has been established in the literature for a large class of networks, to which our results apply immediately, while it is conjectural for others. The theory we develop is grounded in an abstract formalism, and is therefore architecture-agnostic. Overall, our results provide a mathematical explanation for the emergence of equivariant structures in the weights of neural networks during training -- a phenomenon that is consistently observed in practice.
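The flavor of the result can be illustrated numerically in the simplest setting: a two-layer linear network. This is a toy sketch, not the paper's construction; here, invertibility of the first layer plays the role of the identifiability assumption, and the latent group action is obtained by conjugation. The layers are not individually equivariant with respect to the original permutation action, yet each layer is equivariant once the right latent action is chosen.

```python
import numpy as np

rng = np.random.default_rng(0)

# Cyclic-shift permutation acting on the input and output space R^3.
P = np.roll(np.eye(3), 1, axis=0)

# A circulant matrix commutes with the cyclic shift, so it is an
# equivariant end-to-end linear map.
C = np.array([[2.0, 1.0, 0.5],
              [0.5, 2.0, 1.0],
              [1.0, 0.5, 2.0]])
assert np.allclose(C @ P, P @ C)

# Factor C into two layers. W1 is a generic (hence invertible) matrix,
# so neither factor commutes with P on its own.
W1 = rng.standard_normal((3, 3))
W2 = C @ np.linalg.inv(W1)

# The theorem's conclusion in this toy case: a group action on the
# latent space exists that makes each layer equivariant. Conjugating
# the input action by the (invertible) first layer provides it.
rho = W1 @ P @ np.linalg.inv(W1)

print(np.allclose(W1 @ P, rho @ W1))  # layer 1: equivariant, prints True
print(np.allclose(W2 @ rho, P @ W2))  # layer 2: equivariant, prints True
```

The second check follows algebraically from end-to-end equivariance: `W2 @ rho = C @ inv(W1) @ W1 @ P @ inv(W1) = C @ P @ inv(W1) = P @ C @ inv(W1) = P @ W2`. The paper's contribution is to establish an analogous statement abstractly, for nonlinear identifiable networks, where no such explicit conjugation is available.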
Problem

Research questions and friction points this paper is trying to address.

equivariance
neural networks
group actions
identifiability
layerwise structure
Innovation

Methods, ideas, or system contributions that make the work stand out.

equivariant networks
layerwise equivariance
identifiability
group actions
architecture-agnostic theory
👥 Authors
Vahid Shahverdi, Department of Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden
G. Marchetti, Department of Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden
Georg Bokman, University of Amsterdam, The Netherlands
Kathlén Kohn, Associate Professor at KTH
Topics: algebraic geometry, machine learning, computer vision, statistics