Too Vivid to Be Real? Benchmarking and Calibrating Generative Color Fidelity

📅 2026-03-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses oversaturated colors and exaggerated contrast in text-to-image (T2I) generation, a prevalent problem that undermines visual realism and is reinforced by evaluation metrics biased toward vivid images. The study presents the first systematic formulation and quantification of “color fidelity” and introduces CFD, a large-scale dataset of over 1.3 million real and synthetic images ordered by level of color realism. It proposes the Color Fidelity Metric (CFM), which uses a multimodal encoder to learn perceptual color fidelity and significantly outperforms existing metrics in assessing color authenticity. The authors also develop Color Fidelity Refinement (CFR), a training-free method that adaptively modulates the spatial-temporal guidance scale during generation to mitigate over-saturation across multiple T2I models. Together, these components enhance perceptual realism and form a closed-loop framework for both evaluating and improving color fidelity in synthetic imagery.
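To make "oversaturation" concrete, the sketch below computes the mean HSV saturation of an RGB image with NumPy. This is only a crude hand-written proxy and is not CFM, which the summary describes as a learned multimodal encoder; the function name `mean_saturation` and the assumption that pixel values lie in [0, 1] are illustrative choices, not taken from the paper.

```python
import numpy as np


def mean_saturation(rgb):
    """Mean HSV saturation of an RGB image with values in [0, 1].

    Hypothetical helper for illustration only; it is NOT the paper's CFM.
    Expects an array of shape (H, W, 3).
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    cmax = rgb.max(axis=-1)  # per-pixel brightest channel (HSV "value")
    cmin = rgb.min(axis=-1)  # per-pixel darkest channel
    # HSV saturation is (max - min) / max, defined as 0 for black pixels.
    safe_max = np.where(cmax > 0, cmax, 1.0)
    sat = np.where(cmax > 0, (cmax - cmin) / safe_max, 0.0)
    return float(sat.mean())


if __name__ == "__main__":
    gray = np.full((4, 4, 3), 0.5)   # neutral gray: no color at all
    red = np.zeros((4, 4, 3))
    red[..., 0] = 1.0                # pure red: fully saturated
    print(mean_saturation(gray), mean_saturation(red))
```

A statistic like this can flag images that are more saturated than a reference set, but it cannot tell whether vivid colors are plausible for the scene, which is the kind of judgment a learned metric is meant to capture.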

📝 Abstract

Recent advances in text-to-image (T2I) generation have greatly improved visual quality, yet producing images that look as authentic as real-world photographs remains challenging. This is partly due to biases in existing evaluation paradigms: human ratings and preference-trained metrics often favor visually vivid images with exaggerated saturation and contrast, which makes generations too vivid to be real even when the prompt asks for a realistic style. To address this issue, we present the Color Fidelity Dataset (CFD) and the Color Fidelity Metric (CFM) for objective evaluation of color fidelity in realistic-style generations. CFD contains over 1.3M real and synthetic images with ordered levels of color realism, while CFM employs a multimodal encoder to learn perceptual color fidelity. In addition, we propose a training-free Color Fidelity Refinement (CFR) that adaptively modulates the spatial-temporal guidance scale during generation, thereby enhancing color authenticity. Together, CFD supports CFM for assessment, and CFM's learned attention in turn guides CFR to refine T2I fidelity, forming a progressive framework for assessing and improving color fidelity in realistic-style T2I generation. The dataset and code are available at https://github.com/ZhengyaoFang/CFM.
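The idea of modulating a spatial-temporal guidance scale can be sketched with standard classifier-free guidance, where the guided noise prediction is `eps_uncond + w * (eps_cond - eps_uncond)`. In the sketch below, `w` becomes a per-pixel map that also depends on the denoising step. The abstract does not give the actual CFR rule, so everything specific here is an assumption: the function names, the linear decay over steps, the attenuation by an attention map, and the `strength` and `min_scale` parameters are all hypothetical.

```python
import numpy as np


def guidance_scale_map(base_scale, step_frac, attention,
                       min_scale=1.0, strength=0.5):
    """Per-pixel guidance scale for one denoising step.

    Hypothetical rule, not the paper's CFR:
      - step_frac in [0, 1] is the fraction of denoising completed;
        guidance is reduced linearly as it grows (temporal part).
      - attention in [0, 1], shape (H, W), marks regions judged prone
        to over-saturation; guidance is reduced there (spatial part).
    """
    attention = np.clip(np.asarray(attention, dtype=np.float64), 0.0, 1.0)
    temporal = 1.0 - strength * step_frac
    spatial = 1.0 - strength * attention
    # Never drop below min_scale, so some conditioning always remains.
    return np.maximum(base_scale * temporal * spatial, min_scale)


def apply_guidance(eps_uncond, eps_cond, scale):
    """Classifier-free guidance with a scalar or broadcastable scale map."""
    return eps_uncond + scale * (eps_cond - eps_uncond)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eps_u = rng.normal(size=(4, 8, 8))   # (C, H, W) unconditional prediction
    eps_c = rng.normal(size=(4, 8, 8))   # (C, H, W) conditional prediction
    attn = rng.uniform(size=(8, 8))      # stand-in for a metric's attention map
    w = guidance_scale_map(7.5, step_frac=0.8, attention=attn)
    print(apply_guidance(eps_u, eps_c, w).shape)
```

Because the scale map has shape `(H, W)`, it broadcasts across the channel dimension of a `(C, H, W)` prediction, so no retraining or change to the denoiser is needed, which is consistent with the method being described as training-free.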

Problem

Research questions and friction points this paper is trying to address.

color fidelity

text-to-image generation

visual authenticity

realistic-style images

evaluation bias

Innovation

Methods, ideas, or system contributions that make the work stand out.

Color Fidelity

Text-to-Image Generation

Realistic Image Synthesis