🤖 AI Summary
Existing single-view linear probes struggle to model the higher-order interaction structure between the rows and columns of model weights, which limits the effectiveness of weight-space learning. To address this, this work proposes MVProbe, the first multi-view probing framework tailored for weight representations, which explicitly captures higher-order correlations by fusing first-order signals with an interaction-aware view constructed via Gram matrices. The method uses learnable probe vectors and a scaling-law-guided standardization strategy so that features from the different branches are normalized and fused with balanced contributions. Evaluated on the Model Jungle benchmark, MVProbe consistently outperforms the current state-of-the-art ProbeX across diverse architectures, including ResNet, SupViT, MAE, DINO, and Stable Diffusion LoRA.
📝 Abstract
The explosive growth of open-source model repositories has created a Model Jungle, where checkpoints are frequently shared without adequate documentation or metadata. While weight-space learning offers a pathway to identify and analyze these models directly from their parameters, processing full-scale weights is computationally prohibitive. Probing-based methods have emerged as a lightweight alternative, extracting permutation-equivariant representations via learnable probe vectors. However, existing probing methods are limited by a single-view design: they capture first-order structures but fail to encode the rich, higher-order correlation patterns inherent in row-column interactions. To bridge this gap, we introduce MVProbe, a multi-perspective probing framework that synthesizes first-order signals with interaction-aware (Gram-based) views. Our approach is theoretically grounded; we analyze the scaling laws of different probing orders to derive a principled standardization and fusion strategy that ensures balanced contributions from all branches. On the Model Jungle benchmark, MVProbe consistently outperforms the state-of-the-art ProbeX across diverse architectures, including discriminative backbones (ResNet, SupViT, MAE, DINO) and large-scale generative LoRA adapters (Stable Diffusion LoRA).
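To make the two-view idea concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a single weight matrix `W` of shape `(d_out, d_in)`, a hypothetical set of probe vectors `probes` (random here, learnable in the actual method), a first-order view `W @ probes`, and a Gram-based view `(W.T @ W) @ probes`. The per-branch standardization is only one plausible reading of the abstract's "standardization and fusion strategy": the first-order response grows linearly with the scale of `W` and the Gram response grows quadratically, so each branch is rescaled before concatenation.

```python
import numpy as np

def standardize(x, eps=1e-8):
    # Zero-mean, unit-variance rescaling of one branch (an assumed choice).
    return (x - x.mean()) / (x.std() + eps)

def two_view_features(W, probes):
    # W: (d_out, d_in) weight matrix; probes: (d_in, k) probe vectors.
    first = W @ probes        # first-order view, shape (d_out, k)
    gram = W.T @ first        # equals (W.T @ W) @ probes, shape (d_in, k)
    # Standardize each branch separately so neither dominates after fusion.
    return np.concatenate([standardize(first).ravel(),
                           standardize(gram).ravel()])

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 4))
probes = rng.normal(size=(4, 3))

feats = two_view_features(W, probes)
feats_scaled = two_view_features(10.0 * W, probes)   # rescaled weights
perm = rng.permutation(W.shape[0])
feats_perm = two_view_features(W[perm], probes)      # permuted output rows
```

In this sketch the fused features are unchanged when `W` is multiplied by a positive constant, and the Gram branch is unchanged when the rows of `W` are permuted, because `W.T @ W` does not depend on row order. Computing `W.T @ (W @ probes)` also avoids materializing the full Gram matrix.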