🤖 AI Summary
Large vision-language models (VLMs) lack systematic evaluation of color vision capabilities. Method: We introduce the first fine-grained color vision benchmark for VLMs, comprising a manually annotated, multi-category, multi-difficulty test dataset. We propose a failure-pattern-driven taxonomy for color-related errors and integrate prompt engineering with targeted fine-tuning to systematically analyze model performance in color recognition, discrimination, and cross-modal semantic understanding. Contribution/Results: Experiments uncover critical deficiencies in VLMs, including low hue sensitivity, poor discrimination of chromatically similar colors, and misalignment between visual color perception and linguistic color semantics. Our fine-tuning approach yields an average accuracy improvement of 12.7% across diverse color-centric tasks, demonstrating its efficacy in enhancing color perception and comprehension. This work establishes a novel, reproducible evaluation paradigm and benchmark for assessing fundamental visual capabilities in VLMs.
📝 Abstract
With the widespread adoption of large vision-language models (VLMs), their capacity for color vision is becoming crucial. However, the color vision abilities of VLMs have not yet been thoroughly explored. To address this gap, we define a color vision testing task for VLMs and construct a dataset (a portion of the data is available in an anonymous GitHub repository: https://anonymous.4open.science/r/color-vision-test-dataset-3BCD) that covers multiple categories of test questions and tasks of varying difficulty levels. Furthermore, we analyze the types of errors made by VLMs and propose fine-tuning strategies to enhance their performance on color vision tests.