AI Summary
Existing neural network weight representation methods are constrained by architecture and scale, limiting generalization across heterogeneous architectures and datasets. This paper proposes the SNE encoder, the first approach to produce unified, set-level representations of neural networks regardless of architecture or parameter count, enabling cross-architecture and cross-dataset network property prediction. The method introduces three key innovations: (1) a Logit Invariance constraint that jointly models the computational hierarchy and weight-space symmetry; (2) a tunable, hierarchical encoding pipeline comprising padding, chunking, and encoding stages; and (3) the formal definition and solution of two new tasks, cross-dataset and cross-architecture property prediction. Evaluated on standard benchmarks, SNE significantly outperforms existing baselines, demonstrating strong generalization and explicit architecture independence.
Abstract
We propose a neural network weight encoding method for network property prediction that uses set-to-set and set-to-vector functions to efficiently encode neural network parameters. Our approach can encode neural networks in a model zoo of mixed architectures and different parameter sizes, whereas previous approaches require custom encoding models for each architecture. Furthermore, our Set-based Neural network Encoder (SNE) takes into account the hierarchical computational structure of neural networks. To respect the symmetries inherent in network weight space, we use Logit Invariance to learn the minimal required invariance properties. Additionally, we introduce a pad-chunk-encode pipeline that efficiently encodes neural network layers and is adjustable to computational and memory constraints. We also introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture. In cross-dataset property prediction, we evaluate how well property predictors generalize across model zoos trained on different datasets but sharing the same architecture. In cross-architecture property prediction, we evaluate how well property predictors transfer to model zoos of an architecture not seen during training. We show that SNE outperforms the relevant baselines on standard benchmarks.
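The pad-chunk-encode idea can be illustrated with a minimal sketch: each layer's parameters are flattened, zero-padded to a multiple of a chosen chunk size, and split into fixed-size chunks that a set encoder could then consume. The function name, chunk size, and use of zero padding are illustrative assumptions, not the paper's exact implementation (the learned set-to-set and set-to-vector encoders are omitted).

```python
from typing import List

def pad_chunk(layer_weights: List[List[float]], chunk_size: int) -> List[List[float]]:
    """Hypothetical sketch of the pad-and-chunk stages: flatten each
    layer, zero-pad it to a multiple of chunk_size, and split it into
    equal-size chunks. A learned set encoder (not shown) would then
    map the resulting chunk set to a vector."""
    chunks = []
    for layer in layer_weights:
        flat = list(layer)                    # one layer's flattened parameters
        pad_len = (-len(flat)) % chunk_size   # zeros needed to reach a multiple
        flat.extend([0.0] * pad_len)
        # split the padded layer into fixed-size chunks
        for i in range(0, len(flat), chunk_size):
            chunks.append(flat[i:i + chunk_size])
    return chunks

# Layers of different sizes become a uniform set of fixed-size chunks,
# which is what makes the representation architecture-agnostic.
chunks = pad_chunk([[1.0, 2.0, 3.0], [4.0, 5.0]], chunk_size=2)
# chunks: [[1.0, 2.0], [3.0, 0.0], [4.0, 5.0]]
```

Because every chunk has the same size regardless of the originating layer's shape, networks of arbitrary architecture reduce to sets over a common input space; the trade-off between chunk size, memory, and sequence length is what makes the pipeline adjustable to computational constraints.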