🤖 AI Summary
Existing model repositories lack effective retrieval mechanisms, relying solely on text-based search, which impedes concept-level matching (e.g., “dog”) due to semantic misalignment between natural language queries and model internals. Method: This paper proposes a zero-shot classification model retrieval method that requires neither model metadata nor training data. It extracts semantic descriptors from the logit layer of each model using fixed probe inputs, enabling direct alignment of human-understandable concepts to logit-space representations and similarity measurement therein. Contribution/Results: The paper introduces a novel logit-level probing representation paradigm and a collaborative-filtering acceleration strategy that reduces large-scale model encoding overhead by 3×. The approach achieves high retrieval accuracy on both real-world model repositories and fine-grained downstream tasks, and scales effectively to full-size public model repositories.
📝 Abstract
With the increasing number of publicly available models, pretrained online models likely already exist for most tasks users require. However, current model search methods are rudimentary, essentially a text-based search over the documentation, so users often cannot find the relevant models. This paper presents ProbeLog, a method for retrieving classification models that can recognize a target concept, such as "Dog", without access to model metadata or training data. Unlike previous probing methods, ProbeLog computes a descriptor for each output dimension (logit) of each model by observing its responses on a fixed set of inputs (probes). Our method supports both logit-based retrieval ("find more logits like this") and zero-shot, text-based retrieval ("find all logits corresponding to dogs"). As probing-based representations require multiple costly feedforward passes through the model, we develop a method, based on collaborative filtering, that reduces the cost of encoding repositories by 3x. We demonstrate that ProbeLog achieves high retrieval accuracy, both in real-world and fine-grained search tasks, and is scalable to full-size repositories.
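The core idea of the abstract, a per-logit descriptor built from a model's responses to a fixed probe set, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`probelog_descriptors`, `retrieve`), the L2 normalization, and the cosine-similarity ranking are assumptions made for the sketch.

```python
import numpy as np

def probelog_descriptors(model_fn, probes):
    """Build one descriptor per output logit of a model.

    model_fn: callable mapping a probe input to a vector of logits (assumed interface).
    probes:   the fixed, shared set of probe inputs.
    Returns an array of shape (n_logits, n_probes): each row is one logit's
    response pattern across all probes, L2-normalized.
    """
    # Responses matrix: rows are probes, columns are the model's logits.
    responses = np.stack([np.asarray(model_fn(p)) for p in probes])
    # Transpose so each logit gets a descriptor vector over the probe set.
    descriptors = responses.T
    norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
    return descriptors / np.clip(norms, 1e-8, None)

def retrieve(query_descriptor, gallery):
    """Rank gallery logits by cosine similarity to a query logit descriptor
    ("find more logits like this"). gallery: (n_logits, n_probes), rows normalized."""
    q = query_descriptor / np.linalg.norm(query_descriptor)
    return gallery @ q  # higher score = more similar logit
```

For zero-shot text retrieval, the query descriptor would instead come from embedding a concept name (e.g., "dog") into the same probe-response space; how that alignment is done is specified in the paper, not here.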