🤖 AI Summary
This work addresses a limitation of conventional metanetworks: because they rely solely on raw parameters, they can overlook the intrinsic symmetries of neural architectures and struggle to capture functional equivalence. To overcome this, the authors propose quasi-equivariant metanetworks, built on a notion of “quasi-equivariance” that relaxes strict equivariance constraints while preserving functional identity. The approach balances architectural symmetry against model expressivity through group actions and a relaxed equivariance mechanism, and it applies to feedforward, convolutional, and transformer networks. Empirical results indicate that quasi-equivariant metanetworks achieve good trade-offs between symmetry preservation and representational expressivity, whereas strictly equivariant designs tend to be sparse and less expressive.
📝 Abstract
Metanetworks are neural architectures designed to operate directly on pretrained weights to perform downstream tasks. However, the parameter space serves only as a proxy for the underlying function class, and the parameter-function mapping is inherently non-injective: distinct parameter configurations may yield identical input-output behaviors. As a result, metanetworks that rely solely on raw parameters risk overlooking the intrinsic symmetries of the architecture. Reasoning about functional identity is therefore essential for effective metanetwork design, motivating the development of equivariant metanetworks, which incorporate equivariance principles to respect architectural symmetries. Existing approaches, however, typically enforce strict equivariance, which imposes rigid constraints and often leads to sparse and less expressive models. To address this limitation, we introduce the novel concept of quasi-equivariance, which allows metanetworks to move beyond the rigidity of strict equivariance while still preserving functional identity. We lay down a principled basis for this framework and demonstrate its broad applicability across diverse neural architectures, including feedforward, convolutional, and transformer networks. Through empirical evaluation, we show that quasi-equivariant metanetworks achieve good trade-offs between symmetry preservation and representational expressivity. These findings advance the theoretical understanding of weight-space learning and provide a principled foundation for the design of more expressive and functionally robust metanetworks.
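The non-injectivity of the parameter-function mapping can be illustrated with a minimal NumPy sketch. It assumes a two-layer ReLU MLP and uses two standard weight-space symmetries, hidden-neuron permutation and positive per-neuron rescaling; these are chosen for illustration and are not necessarily the group actions studied in the paper. All function and variable names (`mlp`, `permute_hidden`, `rescale_hidden`) are hypothetical.

```python
import numpy as np


def mlp(x, W1, b1, W2, b2):
    """Two-layer ReLU MLP: x -> W2 @ relu(W1 @ x + b1) + b2."""
    hidden = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ hidden + b2


def permute_hidden(W1, b1, W2, b2, perm):
    """Reorder hidden neurons: rows of W1 and b1, columns of W2."""
    return W1[perm], b1[perm], W2[:, perm], b2


def rescale_hidden(W1, b1, W2, b2, scale):
    """Scale each hidden neuron by a positive factor and undo it in the
    next layer. Valid for ReLU because relu(c * z) = c * relu(z) for c > 0."""
    return W1 * scale[:, None], b1 * scale, W2 / scale[None, :], b2


rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 3, 5, 2
W1 = rng.normal(size=(d_hidden, d_in))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(d_out, d_hidden))
b2 = rng.normal(size=d_out)
x = rng.normal(size=d_in)

perm = rng.permutation(d_hidden)
scale = rng.uniform(0.5, 2.0, size=d_hidden)

y_original = mlp(x, W1, b1, W2, b2)
y_permuted = mlp(x, *permute_hidden(W1, b1, W2, b2, perm))
y_rescaled = mlp(x, *rescale_hidden(W1, b1, W2, b2, scale))

# Different parameter configurations, identical input-output behavior.
same_function_perm = np.allclose(y_original, y_permuted)
same_function_scale = np.allclose(y_original, y_rescaled)
print(same_function_perm, same_function_scale)
```

A metanetwork that treats the flattened weights as an ordinary feature vector would see the three parameter sets above as unrelated inputs, even though they define the same function. Equivariant designs build this symmetry into the architecture; the quasi-equivariant framework described in the abstract relaxes that constraint while still preserving functional identity.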