A New Perspective on Precision and Recall for Generative Models

📅 2025-11-04

🤖 AI Summary

Scalar metrics for generative models collapse sample fidelity and diversity into a single number, so they cannot describe the trade-off between the two. Precision-recall (PR) curves capture that trade-off, but estimating them is difficult. This work proposes a framework that estimates entire PR curves from a binary classification standpoint. A statistical analysis of the resulting estimates yields, as a byproduct, a minimax upper bound on the PR estimation risk. The framework also extends several landmark PR metrics from the literature, which by design are restricted to the extreme values of the curve. Experiments in various settings examine the different behaviors of the resulting curves.

📝 Abstract

With the recent success of generative models in image and text generation, the question of their evaluation has gained a lot of attention. While most state-of-the-art methods rely on scalar metrics, the introduction of Precision and Recall (PR) for generative models has opened up a new avenue of research. The associated PR curve allows for a richer analysis, but its estimation poses several challenges. In this paper, we present a new framework for estimating entire PR curves based on a binary classification standpoint. We conduct a thorough statistical analysis of the proposed estimates. As a byproduct, we obtain a minimax upper bound on the PR estimation risk. We also show that our framework extends several landmark PR metrics of the literature, which by design are restricted to the extreme values of the curve. Finally, we study the different behaviors of the curves obtained experimentally in various settings.

Problem

Research questions and friction points this paper is trying to address.

Estimating entire Precision-Recall curves for generative models

Addressing statistical challenges in PR curve evaluation

Extending existing PR metrics beyond extreme curve values
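
To make the difference between a full PR curve and its extreme values concrete, here is a minimal sketch on a toy discrete example. It assumes the widely used discrete definition alpha(lambda) = sum_w min(lambda * P(w), Q(w)) for precision and beta(lambda) = alpha(lambda) / lambda for recall, with P the real distribution and Q the generated one. Whether the paper uses exactly this parametrization is an assumption, and the function names are hypothetical.

```python
import numpy as np

def pr_curve(p, q, lambdas):
    """Exact PR curve for two discrete distributions (assumed definition).

    p: probabilities of the real distribution P over a finite set
    q: probabilities of the generated distribution Q over the same set
    lambdas: positive trade-off parameters, one curve point per value
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    # alpha(lambda) = sum over outcomes of min(lambda * P, Q)
    alpha = np.minimum(lambdas[:, None] * p[None, :], q[None, :]).sum(axis=1)
    beta = alpha / lambdas
    return alpha, beta

def extreme_values(p, q):
    """Limits of the curve: the only information support-based metrics keep."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    max_precision = q[p > 0].sum()  # generated mass lying on the real support
    max_recall = p[q > 0].sum()     # real mass lying on the generated support
    return max_precision, max_recall

# Toy example: Q covers all of P but puts half of its mass elsewhere.
p = [0.5, 0.5, 0.0]
q = [0.25, 0.25, 0.5]
alpha, beta = pr_curve(p, q, lambdas=[0.01, 1.0, 100.0])
max_precision, max_recall = extreme_values(p, q)
```

In this toy case the extreme values are a precision of 0.5 and a recall of 1.0, while the point at lambda = 1 is (0.5, 0.5). The two extremes alone say nothing about how the curve moves between them.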

Innovation

Methods, ideas, or system contributions that make the work stand out.

Binary classification framework for PR curves

Statistical analysis of proposed estimation methods

Extension of existing metrics to the full range of the curve
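
The binary classification view can be illustrated with a small sample-based sketch. Under the discrete definition alpha(lambda) = sum_w min(lambda * P(w), Q(w)), a standard identity gives alpha(lambda) as the minimum over classifiers of lambda * (fraction of real samples flagged as generated) + (fraction of generated samples flagged as real). The code below is not the paper's estimator: it assumes one-dimensional samples, restricts the classifier to simple threshold rules, and evaluates errors on the same samples used to pick the threshold, so it only approximates an upper bound on the true curve. All names are hypothetical.

```python
import numpy as np

def estimate_pr_curve(real, gen, lambdas):
    """Plug-in PR estimate from 1-D samples using threshold classifiers.

    real, gen: 1-D arrays of real and generated samples
    lambdas: positive trade-off parameters, one curve point per value
    """
    real = np.sort(np.asarray(real, dtype=float))
    gen = np.sort(np.asarray(gen, dtype=float))
    lambdas = np.asarray(lambdas, dtype=float)

    # Candidate thresholds; +/- infinity give the two constant classifiers.
    thresholds = np.concatenate(([-np.inf], real, gen, [np.inf]))

    # Rule A: "z > t is generated".
    real_above = 1.0 - np.searchsorted(real, thresholds, side="right") / real.size
    gen_below = np.searchsorted(gen, thresholds, side="right") / gen.size
    # Rule B: "z <= t is generated" (mirror image of rule A).
    real_below = 1.0 - real_above
    gen_above = 1.0 - gen_below

    # Error on real samples (flagged as generated) and on generated samples.
    err_real = np.concatenate((real_above, real_below))
    err_gen = np.concatenate((gen_below, gen_above))

    # alpha(lambda): smallest weighted error over the classifier family.
    cost = lambdas[:, None] * err_real[None, :] + err_gen[None, :]
    alpha = cost.min(axis=1)
    beta = alpha / lambdas
    return alpha, beta

# Usage: generated samples are a shifted copy of the real distribution.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=500)
gen = rng.normal(1.0, 1.0, size=500)
precision, recall = estimate_pr_curve(real, gen, lambdas=np.logspace(-2, 2, 21))
```

A richer classifier family, or held-out samples for evaluating the errors, would change the bias of this estimate. The threshold family is chosen here only because it keeps the sketch short and exact to compute.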
