Rethinking FID Through the Geometry of the Reference Dataset

📅 2026-05-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the mismatch between Fréchet Inception Distance (FID) and sample quality, showing that lower FID scores do not necessarily correspond to better generated outputs. The study finds that the geometric structure of the reference dataset, in particular its distributional density and effective rank, explains part of this mismatch. Through a precision and recall attribution and ablations over alternative feature spaces and distances, the authors show that FID tends to behave favorably on concentrated datasets but can worsen on more dispersed ones even as samples improve. Experiments across six datasets indicate that the geometry of the reference data significantly affects how FID responds to improvements in sample quality. The findings suggest interpreting distributional metrics together with the geometry of the reference dataset for more reliable benchmarking of generative models.
📝 Abstract
Fréchet Inception Distance (FID) is widely used to evaluate image generators, yet lower FID does not always correspond to better sample quality. We show that this mismatch depends in part on the geometry of the reference dataset. In a controlled study across six datasets, distributional density and effective rank significantly explain how FID changes as sample quality improves. Concentrated datasets tend to yield more favorable FID trends, whereas more dispersed datasets can make FID worsen despite better samples. An attribution of FID changes to precision and recall, together with ablations over alternative feature spaces and distances, supports the same conclusion. These results suggest that distributional metrics should be interpreted together with the geometry of the reference dataset for more reliable benchmarking.
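For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch, not the authors' code. It computes the standard Fréchet distance between Gaussians fitted to two feature matrices, plus one common entropy-based definition of effective rank. The function names, the use of NumPy, and the choice of effective-rank definition are assumptions made for illustration; the listing does not state which definitions of effective rank or distributional density the paper uses, so density is left out.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is an (n_samples, n_dims) array of features, e.g. Inception
    activations. Hypothetical helper, not taken from the paper.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # PSD inputs; clip tiny negative values caused by rounding.
    eigvals = np.linalg.eigvals(cov_a @ cov_b).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


def effective_rank(feats):
    """Entropy-based effective rank of a feature matrix (assumed definition).

    Returns exp(H), where H is the Shannon entropy of the singular values
    of the centered features, normalized to sum to one.
    """
    centered = feats - feats.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    p = singular_values / singular_values.sum()
    p = p[p > 0]  # drop exact zeros so log is defined
    return float(np.exp(-np.sum(p * np.log(p))))
```

Under this definition, a reference set whose features spread evenly over many directions has an effective rank close to the feature dimension, while a set concentrated along a few directions has a low one. That is one way to quantify the "dispersed" versus "concentrated" contrast the abstract describes.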
Problem

Research questions and friction points this paper is trying to address.

Fréchet Inception Distance
image generation evaluation
reference dataset geometry
sample quality
distributional metrics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fréchet Inception Distance
dataset geometry
distributional density
effective rank
image generation evaluation