Assessing Color Vision Test in Large Vision-language Models

2025-07-15
Citations: 0
Influential: 0
AI Summary
Problem: Large vision-language models (VLMs) lack a systematic evaluation of their color vision capabilities. Method: We introduce the first fine-grained color vision benchmark for VLMs, comprising a manually annotated, multi-category, multi-difficulty test dataset. We propose a failure-pattern-driven taxonomy for color-related errors and combine prompt engineering with targeted fine-tuning to systematically analyze model performance in color recognition, discrimination, and cross-modal semantic understanding. Contribution/Results: Experiments uncover critical deficiencies in VLMs, including low hue sensitivity, poor discrimination of chromatically similar colors, and misalignment between visual color perception and linguistic color semantics. Our fine-tuning approach yields an average accuracy improvement of 12.7% across diverse color-centric tasks, demonstrating its efficacy in enhancing color perception and comprehension. This work establishes a novel, reproducible evaluation paradigm and benchmark for assessing fundamental visual capabilities in VLMs.
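
One of the failure patterns named above is confusion between chromatically similar colors. The sketch below shows one way such errors might be separated from confusions between distant hues, by measuring the angular hue distance between the expected and predicted color names. It is only an illustration, not the paper's taxonomy: the `REFERENCE_RGB` table, the function names, and the 45-degree threshold are all assumptions introduced here.

```python
import colorsys

# Hypothetical reference RGB values for a few color names (not from the paper).
REFERENCE_RGB = {
    "red": (255, 0, 0),
    "orange": (255, 165, 0),
    "yellow": (255, 255, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "purple": (128, 0, 128),
}


def hue_degrees(rgb):
    """Return the HSV hue of an 8-bit RGB triple, in degrees [0, 360)."""
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, _, _ = colorsys.rgb_to_hsv(r, g, b)
    return hue * 360.0


def hue_distance(name_a, name_b):
    """Shortest angular distance between two named hues, in degrees [0, 180]."""
    diff = abs(hue_degrees(REFERENCE_RGB[name_a]) - hue_degrees(REFERENCE_RGB[name_b])) % 360.0
    return min(diff, 360.0 - diff)


def classify_error(expected, predicted, similar_threshold=45.0):
    """Label an answer as correct, a near-hue confusion, or a distant-hue confusion.

    The threshold is an assumed illustrative value, not one taken from the paper.
    """
    if predicted == expected:
        return "correct"
    if hue_distance(expected, predicted) <= similar_threshold:
        return "similar_hue_confusion"
    return "distant_hue_confusion"
```

With these assumed reference values, `classify_error("red", "orange")` is labeled a similar-hue confusion, while `classify_error("red", "blue")` is labeled a distant-hue confusion. Hue is undefined for grays, so achromatic colors would need separate handling.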

๐Ÿ“ Abstract
With the widespread adoption of large vision-language models, their capacity for color vision is crucial. However, the color vision abilities of these models have not yet been thoroughly explored. To address this gap, we define a color vision testing task for large vision-language models and construct a dataset that covers multiple categories of test questions and tasks of varying difficulty levels (a sample of the data is available in an anonymous GitHub repository: https://anonymous.4open.science/r/color-vision-test-dataset-3BCD). Furthermore, we analyze the types of errors made by large vision-language models and propose fine-tuning strategies to enhance their performance in color vision tests.
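
Because the dataset spans several question categories and difficulty levels, results are most informative when broken down along both axes. A minimal sketch of such a breakdown follows. The record fields (`category`, `difficulty`, `expected`, `predicted`) and the sample rows are hypothetical and are not the schema or data of the released dataset.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {(category, difficulty): accuracy} for a list of answer records."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        key = (rec["category"], rec["difficulty"])
        totals[key] += 1
        # Compare case-insensitively after trimming surrounding whitespace.
        if rec["predicted"].strip().lower() == rec["expected"].strip().lower():
            correct[key] += 1
    return {key: correct[key] / totals[key] for key in totals}


# Made-up records for illustration only; not data from the paper.
sample = [
    {"category": "recognition", "difficulty": "easy", "expected": "red", "predicted": "Red"},
    {"category": "recognition", "difficulty": "easy", "expected": "green", "predicted": "blue"},
    {"category": "discrimination", "difficulty": "hard", "expected": "12", "predicted": "12 "},
]
scores = accuracy_by_group(sample)
```

Exact string matching after normalization is the simplest scoring rule. Free-form model answers would likely need a more tolerant matcher, which is left out here.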
Problem

Research questions and friction points this paper is trying to address.

Evaluating color vision capabilities in vision-language models
Creating a dataset for diverse color vision tests
Improving model performance via error analysis and fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Defined color vision testing task for models
Constructed multi-category difficulty dataset
Proposed fine-tuning strategies for improvement
Authors

Hongfei Ye, University of Chinese Academy of Sciences
Bin Chen, University of Chinese Academy of Sciences
Wenxi Liu, Fuzhou University (research interests: Computer vision)
Yu Zhang, University of Chinese Academy of Sciences
Zhao Li, Zhejiang Lab
Dandan Ni, Zhejiang University
Hongyang Chen, Sun Yat-sen University (research interests: SDN, Cloud Computing, Microservice, AIOps)