🤖 AI Summary
Systematic comparison of generative models, such as large language models and text-to-image diffusion models, becomes increasingly challenging as the number and scale of models grow, owing to heterogeneous output spaces and the lack of cross-model behavioral alignment.
Method: This paper proposes a query-set-based embedding framework that maps the outputs of diverse models on an identical set of queries into a shared reproducing kernel Hilbert space (RKHS), enabling comparable behavioral modeling across models.
Contribution/Results: We establish, for the first time, a consistency theory for generative model embeddings in data-induced kernel spaces, providing sufficient conditions under which the embeddings are consistently estimated as both the query-set size and the number of models grow. Combining kernel methods, embedding learning, and asymptotic statistical analysis, we obtain unbiased and consistent estimates of model representations. The framework furnishes a verifiable theoretical foundation and quantitative tools for performance evaluation, bias diagnosis, and alignment optimization of generative models.
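To make the shared-RKHS idea concrete, here is a minimal sketch of how two models' responses to the same query set could be compared via kernel mean embeddings. Everything here is illustrative, not the paper's exact construction: the synthetic `model_a`/`model_b` output matrices, the RBF kernel, and the `gamma` bandwidth are all assumptions standing in for real vectorized model outputs.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # Gaussian RBF kernel between two vectorized model outputs.
    return np.exp(-gamma * np.sum((x - y) ** 2))

def mmd_squared(outputs_a, outputs_b, gamma=1.0):
    """Squared MMD between two models' kernel mean embeddings.

    outputs_a, outputs_b: (n_queries, d) arrays, where row i is a model's
    vectorized response to query i of the shared query set. The distance
    between the two mean embeddings in the RKHS is the MMD.
    """
    mean_k = lambda X, Y: np.mean([rbf_kernel(x, y, gamma) for x in X for y in Y])
    return mean_k(outputs_a, outputs_a) + mean_k(outputs_b, outputs_b) \
        - 2.0 * mean_k(outputs_a, outputs_b)

# Two hypothetical models answering the same 50 queries, outputs in R^8;
# model_b's output distribution is shifted relative to model_a's.
rng = np.random.default_rng(0)
model_a = rng.normal(0.0, 1.0, size=(50, 8))
model_b = rng.normal(0.5, 1.0, size=(50, 8))

print(mmd_squared(model_a, model_a))  # 0: a model coincides with itself
print(mmd_squared(model_a, model_b))  # positive: the shifted model differs
```

Because every model is evaluated on the same queries and embedded in the same RKHS, pairwise MMD values are directly comparable across any collection of models, which is the point of the shared embedding space.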
📝 Abstract
Generative models, such as large language models and text-to-image diffusion models, produce relevant information when presented with a query. Different models may produce different information when presented with the same query. As the landscape of generative models evolves, it is important to develop techniques to study and analyze differences in model behaviour. In this paper, we present novel theoretical results for embedding-based representations of generative models in the context of a set of queries. In particular, we establish sufficient conditions for the consistent estimation of the model embeddings in situations where the query set and the number of models grow.
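The consistency claim above can be illustrated with a toy numerical experiment. The sketch below is an assumption-laden stand-in for the paper's theory: it uses a linear kernel, so a model's empirical mean embedding is simply the sample mean of its (here, synthetic Gaussian) outputs, and the estimation error should shrink toward zero at the usual parametric rate as the query set grows.

```python
import numpy as np

rng = np.random.default_rng(1)

def embedding_error(n_queries, dim=4, trials=200):
    # With a linear kernel, the empirical mean embedding of a model's
    # outputs is the sample mean of its output vectors; the population
    # embedding here is the zero vector. We average the estimation error
    # over repeated draws of the query set.
    errors = []
    for _ in range(trials):
        outputs = rng.normal(size=(n_queries, dim))  # one model's outputs
        errors.append(np.linalg.norm(outputs.mean(axis=0)))
    return float(np.mean(errors))

err_small = embedding_error(20)    # small query set
err_large = embedding_error(2000)  # much larger query set
print(err_small, err_large)        # error drops as the query set grows
```

This only demonstrates the intuition for a single model under an idealized kernel; the paper's contribution is the sufficient conditions under which such convergence holds jointly as both the query set and the number of models grow.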