Too Vivid to Be Real? Benchmarking and Calibrating Generative Color Fidelity

📅 2026-03-11
🤖 AI Summary
This work addresses the oversaturated colors and exaggerated contrast that are common in text-to-image (T2I) generation and that undermine visual realism, a problem reinforced by evaluation metrics biased toward vivid images. The study presents the first systematic formulation and quantification of “color fidelity,” introducing CFD, a large-scale dataset of over 1.3 million real and synthetic images with ordered levels of color realism. It proposes the Color Fidelity Metric (CFM), which uses a multimodal encoder to learn perceptual color fidelity and significantly outperforms existing metrics in assessing color authenticity. The authors also develop a training-free Color Fidelity Refinement (CFR) method that adaptively modulates the spatial-temporal guidance scale to mitigate oversaturation across multiple T2I models. Together, these components improve perceptual realism and form a closed-loop framework for both evaluating and improving color fidelity in synthetic imagery.

📝 Abstract
Recent advances in text-to-image (T2I) generation have greatly improved visual quality, yet producing images that look as authentic as real-world photography remains challenging. This is partly due to biases in existing evaluation paradigms: human ratings and preference-trained metrics often favor vivid images with exaggerated saturation and contrast, which makes generations too vivid to be real even when the prompt asks for a realistic style. To address this issue, we present the Color Fidelity Dataset (CFD) and the Color Fidelity Metric (CFM) for objective evaluation of color fidelity in realistic-style generations. CFD contains over 1.3M real and synthetic images with ordered levels of color realism, while CFM employs a multimodal encoder to learn perceptual color fidelity. In addition, we propose a training-free Color Fidelity Refinement (CFR) that adaptively modulates the spatial-temporal guidance scale during generation, thereby enhancing color authenticity. Together, CFD supports CFM for assessment, and CFM's learned attention in turn guides CFR to refine T2I fidelity, forming a progressive framework for assessing and improving color fidelity in realistic-style T2I generation. The dataset and code are available at https://github.com/ZhengyaoFang/CFM.
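The idea of modulating a spatial-temporal guidance scale can be illustrated with a small sketch. It is not the authors' CFR implementation: the only standard part is the classifier-free guidance combination `uncond + w * (cond - uncond)`. The function names, the linear schedule, the `attention_map` input, and the default values `base_scale=7.5` and `min_scale=3.0` are all assumptions chosen for illustration.

```python
# Illustrative sketch only; NOT the paper's CFR method.
# Standard part: classifier-free guidance, eps = uncond + w * (cond - uncond).
# Assumed part: the scale w varies per pixel (via a hypothetical attention map
# with values in [0, 1]) and per denoising step (via a linear schedule).

def guidance_scale(base_scale, min_scale, step, num_steps, attention):
    """Hypothetical schedule: shrink the scale where attention is high
    and as denoising progresses."""
    progress = step / max(num_steps - 1, 1)  # 0.0 at the first step, 1.0 at the last
    reduction = (base_scale - min_scale) * progress * attention
    return base_scale - reduction


def guided_noise(eps_uncond, eps_cond, attention_map, step, num_steps,
                 base_scale=7.5, min_scale=3.0):
    """Combine unconditional and conditional noise predictions (2D lists)
    with a per-pixel, per-step guidance scale."""
    out = []
    for row_u, row_c, row_a in zip(eps_uncond, eps_cond, attention_map):
        out_row = []
        for u, c, a in zip(row_u, row_c, row_a):
            w = guidance_scale(base_scale, min_scale, step, num_steps, a)
            out_row.append(u + w * (c - u))  # classifier-free guidance
        out.append(out_row)
    return out


if __name__ == "__main__":
    uncond = [[0.0, 0.0]]
    cond = [[1.0, 1.0]]
    attention = [[0.0, 1.0]]  # second pixel flagged as likely oversaturated
    print(guided_noise(uncond, cond, attention, step=9, num_steps=10))
```

In this toy setup, a pixel with zero attention keeps the full base scale at every step, while a fully attended pixel is relaxed toward the minimum scale by the final step. How CFR actually derives and applies its modulation is described in the paper and the linked repository.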
Problem

Research questions and friction points this paper is trying to address.

color fidelity
text-to-image generation
visual authenticity
realistic-style images
evaluation bias
Innovation

Methods, ideas, or system contributions that make the work stand out.

Color Fidelity
Text-to-Image Generation
Realistic Image Synthesis
Perceptual Evaluation
Training-Free Refinement
Zhengyao Fang
Harbin Institute of Technology, Shenzhen
Zexi Jia
Independent Researcher
Yijia Zhong
College of Computer Science and Artificial Intelligence, Fudan University
Pengcheng Luo
Institute for Artificial Intelligence, Peking University
Jinchao Zhang
WeChat AI - Pattern Recognition Center
Deep Learning, Natural Language Processing, Machine Translation, Dialogue System
Guangming Lu
Harbin Institute of Technology, Shenzhen
Computer Vision, Machine Learning
Jun Yu
Shenzhen University
Water Splitting, CO2 Electroreduction, NH3-SCR
Wenjie Pei
Harbin Institute of Technology, Shenzhen; Peng Cheng Laboratory