Black-box model classification under the discriminative factorization

📅 2026-05-08

🤖 AI Summary
In the black-box setting, where users can interact with a system only through API queries, low-dimensional model representations built from response embeddings are highly sensitive to the query set, which makes model-level classification unstable. This work introduces the discriminative factorization, a framework for distinguishing high-quality from low-quality query sets, and shows that under it the probability of chance-level classification decays exponentially in the query budget. Experiments on three auditing tasks show that the estimated factorization parameters predict the empirical performance decay rate, and that query sets selected via the estimated discriminative field reproduce the empirical ordering of oracle query sets.
📝 Abstract
Access to modern generative systems is often restricted to querying an API (the "black-box" setting), and many properties of the system are unknown to the user at inference time. While recent work has shown that low-dimensional representations of models based on the relationship between their embedded responses to a set of queries are useful for inferring model-level properties, the quality of these representations is highly sensitive to the query set. We introduce the discriminative factorization to distinguish between high- and low-quality query sets in the context of black-box model-level classification. Under this framework, the probability of chance-level classification decays exponentially in the query budget. On three auditing tasks, estimated factorization parameters predict the empirical performance decay rate. We conclude by showing that query sets selected using the estimated discriminative field reproduce the empirical ordering of oracle query sets.
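To make the exponential-decay claim concrete, here is a minimal sketch of how a decay rate could be read off empirical results. It assumes the probability of chance-level classification follows p(m) ≈ C · exp(−r · m) for query budget m, and fits r with a log-linear least-squares regression. The function name `fit_decay_rate`, the symbols C and r, and the data are hypothetical illustrations. The abstract does not specify the paper's estimator or its factorization parameters, so this is not the authors' method.

```python
import math


def fit_decay_rate(budgets, chance_probs):
    """Fit log p = log C - r * m by ordinary least squares.

    Returns (r, C). Assumes every probability is strictly positive
    and that at least two distinct budgets are supplied.
    """
    xs = list(budgets)
    ys = [math.log(p) for p in chance_probs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The decay rate is the negated slope of the log-probability line.
    return -slope, math.exp(intercept)


# Synthetic data generated from a known rate (illustration only,
# not results from the paper).
budgets = [1, 2, 4, 8, 16]
true_rate, true_c = 0.3, 0.5
probs = [true_c * math.exp(-true_rate * m) for m in budgets]

rate, c = fit_decay_rate(budgets, probs)
```

On noiseless synthetic data the fit recovers the generating rate. With real measurements, probabilities estimated as exactly zero would need to be excluded or smoothed before taking logarithms.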
Problem

Research questions and friction points this paper is trying to address.

black-box model classification
query set quality
discriminative factorization
model-level properties
API-based inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

discriminative factorization
black-box model classification
query set selection
model auditing
representation learning