Contamination Detection for VLMs using Multi-Modal Semantic Perturbation

📅 2025-11-05

📈 Citations: 0

✨ Influential: 0

career value

183K/year

🤖 AI Summary

Visual language models (VLMs) pretrained on internet-scale proprietary data often exhibit inflated performance due to test-set leakage, yet existing leakage detection methods fail on such models. This paper first systematically exposes the fundamental limitations of prevailing VLM contamination detection approaches. We propose a novel, multimodal semantic perturbation–based detection framework that constructs adversarial test environments by jointly perturbing the semantic spaces of images and texts—e.g., via attribute substitution or relational inversion—to expose model reliance on leaked data. Our method is robust across diverse contamination strategies, highly interpretable, and requires no access to training data or model gradients. Extensive experiments across realistic contamination scenarios demonstrate consistent superiority over baselines, achieving an average 23.6% improvement in detection accuracy. The code and perturbed benchmark dataset will be publicly released.

Technology Category

Application Category

📝 Abstract

Recent advances in Vision-Language Models (VLMs) have achieved state-of-the-art performance on numerous benchmark tasks. However, the use of internet-scale, often proprietary, pretraining corpora raises a critical concern for both practitioners and users: inflated performance due to test-set leakage. While prior works have proposed mitigation strategies such as decontamination of pretraining data and benchmark redesign for LLMs, the complementary direction of developing detection methods for contaminated VLMs remains underexplored. To address this gap, we deliberately contaminate open-source VLMs on popular benchmarks and show that existing detection approaches either fail outright or exhibit inconsistent behavior. We then propose a novel simple yet effective detection method based on multi-modal semantic perturbation, demonstrating that contaminated models fail to generalize under controlled perturbations. Finally, we validate our approach across multiple realistic contamination strategies, confirming its robustness and effectiveness. The code and perturbed dataset will be released publicly.

Problem

Research questions and friction points this paper is trying to address.

Detecting test-set contamination in vision-language models

Evaluating existing detection methods' failure on contaminated VLMs

Developing robust contamination detection using multi-modal perturbations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-modal semantic perturbation for contamination detection

Testing model generalization under controlled perturbations

Validating robustness across realistic contamination strategies

🔎 Similar Papers

A Comprehensive Survey of Contamination Detection Methods in Large Language Models