🤖 AI Summary
Evaluating deep generative models (DGMs) has long suffered from an inherent trade-off among fidelity, diversity, and novelty. This paper introduces the first theory-driven evaluation framework grounded in the Law of Total Expectation, pioneering its application to explicitly model the stochasticity of the underlying data distribution—thereby enabling rigorous disentanglement of memorization bias from true generalization capability. Methodologically, we integrate the Maximum Mean Discrepancy (MMD) metric with unsupervised DINOv2 visual features to construct a lightweight, training-free, plug-and-play evaluation pipeline. On multiple standard benchmarks, our approach matches or surpasses state-of-the-art methods in quantitative performance while accelerating inference by 3.2×. Crucially, it delivers significantly enhanced fine-grained diagnostic capability for assessing large-model-generated content quality.
📝 Abstract
Deep generative models (DGMs) have caused a paradigm shift in the field of machine learning, yielding noteworthy advancements in domains such as image synthesis, natural language processing, and other related areas. However, a comprehensive evaluation of these models that accounts for the trichotomy between fidelity, diversity, and novelty in generated samples remains a formidable challenge. A recently introduced solution that has emerged as a promising approach in this regard is the Feature Likelihood Divergence (FLD), a method that offers a theoretically motivated practical tool, yet also exhibits some computational challenges. In this paper, we propose PALATE, a novel enhancement to the evaluation of DGMs that addresses limitations of existing metrics. Our approach is based on a peculiar application of the law of total expectation to random variables representing accessible real data. When combined with the MMD baseline metric and DINOv2 feature extractor, PALATE offers a holistic evaluation framework that matches or surpasses state-of-the-art solutions while providing superior computational efficiency and scalability to large-scale datasets. Through a series of experiments, we demonstrate the effectiveness of the PALATE enhancement, contributing a computationally efficient, holistic evaluation approach that advances the field of DGMs assessment, especially in detecting sample memorization and evaluating generalization capabilities.