GMValuator: Similarity-based Data Valuation for Generative Models

2023-04-21
Citations: 0
Influential: 0
AI Summary
Existing data valuation methods primarily target discriminative models and lack adaptability to generative models; moreover, current generative valuation approaches often rely on specific architectures and suffer from limited robustness and efficiency. This paper proposes GMValuator, the first training-free, model-agnostic framework for generative data valuation. It leverages fine-grained similarity matching between generated samples and training data, incorporates an image-quality-aware bias calibration mechanism, and establishes a four-dimensional interpretable evaluation criterion: reasonableness, fidelity, diversity, and consistency. The method integrates no-reference image quality assessment (NR-IQA), cross-domain nearest-neighbor retrieval, and contribution attribution propagation. Extensive experiments on StyleGAN2, DDPM, and benchmarks including FFHQ and CIFAR-10 demonstrate that GMValuator achieves significantly higher valuation accuracy than state-of-the-art baselines, with over 5× improvement in computational efficiency.
Abstract
Data valuation plays a crucial role in machine learning. Existing data valuation methods have primarily focused on discriminative models, neglecting generative models that have recently gained considerable attention. The few existing data valuation methods designed for deep generative models either concentrate on specific models or lack robustness in their outcomes. Moreover, efficiency remains a notable shortcoming. To bridge these gaps, we formulate the data valuation problem in generative models from a similarity-matching perspective. Specifically, we introduce Generative Model Valuator (GMValuator), the first training-free and model-agnostic approach to provide data valuation for generation tasks. It enables efficient data valuation through a novel similarity-matching module, calibrates biased contributions by incorporating image quality assessment, and attributes credit to all training samples based on their contributions to the generated samples. Additionally, we introduce four evaluation criteria for assessing data valuation methods in generative models, aligning with principles of plausibility and truthfulness. GMValuator is extensively evaluated on various datasets and generative architectures to demonstrate its effectiveness.
Problem

Research questions and friction points this paper is trying to address.

Addresses data valuation for generative models
Introduces training-free model-agnostic valuation approach
Improves efficiency and robustness in valuation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free data valuation
Model-agnostic similarity matching
Image quality assessment calibration
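
As a rough illustration of how the three ideas above could fit together, the following Python sketch matches each generated sample to its nearest training samples, weights the match by a quality score, and accumulates credit per training sample. It is not the paper's implementation: the function name `similarity_valuation`, the softmax weighting over squared Euclidean distances, the parameters `k` and `temperature`, and the use of precomputed feature vectors and quality scores in [0, 1] are all assumptions made for illustration.

```python
import numpy as np


def similarity_valuation(train_feats, gen_feats, gen_quality, k=3, temperature=1.0):
    """Hypothetical sketch of similarity-based data valuation.

    train_feats: (n_train, d) feature vectors of training samples (assumed precomputed).
    gen_feats:   (n_gen, d) feature vectors of generated samples (assumed precomputed).
    gen_quality: (n_gen,) quality score per generated sample (assumed to be in [0, 1]).
    Returns an (n_train,) array of non-negative scores that sum to 1 (or all zeros).
    """
    train_feats = np.asarray(train_feats, dtype=float)
    gen_feats = np.asarray(gen_feats, dtype=float)
    gen_quality = np.asarray(gen_quality, dtype=float)

    n_train = train_feats.shape[0]
    k = min(k, n_train)

    # Pairwise squared Euclidean distances, shape (n_gen, n_train).
    diffs = gen_feats[:, None, :] - train_feats[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)

    values = np.zeros(n_train)
    for j in range(gen_feats.shape[0]):
        # Similarity matching: the k training samples closest to generated sample j.
        nearest = np.argsort(dists[j])[:k]

        # Assumed weighting: softmax over negative distances, so closer matches get more credit.
        logits = -dists[j, nearest] / temperature
        weights = np.exp(logits - logits.max())
        weights /= weights.sum()

        # Quality calibration: low-quality generated samples hand out less credit.
        values[nearest] += gen_quality[j] * weights

    total = values.sum()
    return values / total if total > 0 else values


# Toy usage with made-up 2-D features.
train = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]])
gen = np.array([[0.1, 0.0], [0.9, 0.1]])
quality = np.array([1.0, 0.5])
scores = similarity_valuation(train, gen, quality, k=2)
print(scores)
```

In this toy example the third training point is far from both generated samples, so it never appears among the two nearest neighbours and receives no credit. The first training point ends up with the largest score because its closest generated sample has the higher quality score. No model is trained or queried here, which is the sense in which such a scheme is training-free and model-agnostic.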