Challenges in Understanding Modality Conflict in Vision-Language Models

📅 2025-09-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates how vision-language models (VLMs) handle multimodal conflicts, aiming to isolate and characterize conflict detection separately from conflict resolution. The authors introduce a mechanistic attribution framework grounded in linear probing, used to verify the decodability of conflict signals, together with grouped attention pattern analysis, and apply it to LLaVA-OV-7B. The key empirical finding is a demonstration that conflict detection signals are linearly separable in intermediate layers; moreover, detection and resolution exhibit distinct, layer-wise attention patterns, with detection dominating earlier layers and resolution concentrating in later ones, consistent with functional separation along the computational pathway. These results suggest a staged processing mechanism for multimodal conflict handling in VLMs, improving interpretability and enabling targeted interventions, and they point toward conflict-aware VLM architecture design and debugging.
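The paper's linear-probing step can be illustrated with a minimal sketch: fit a single linear classifier on intermediate-layer hidden states and check whether "conflict" vs. "no-conflict" inputs are separable. The data below is synthetic (the paper probes real LLaVA-OV-7B activations, which are much higher-dimensional); the dimensionality, sample counts, and offset magnitude are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder activations standing in for intermediate-layer hidden states of
# a VLM such as LLaVA-OV-7B (real states are far higher-dimensional; reduced
# here for speed). "Conflict" examples carry an offset along one direction,
# mimicking a linearly decodable conflict signal.
d_model, n_per_class = 64, 200
direction = rng.normal(size=d_model)
direction /= np.linalg.norm(direction)

clean = rng.normal(size=(n_per_class, d_model))
conflict = rng.normal(size=(n_per_class, d_model)) + 3.0 * direction

X = np.vstack([clean, conflict])
y = np.array([-1.0] * n_per_class + [1.0] * n_per_class)

# Shuffle, then hold out 25% of examples for evaluation.
idx = rng.permutation(len(y))
X, y = X[idx], y[idx]
split = int(0.75 * len(y))
X_tr, y_tr, X_te, y_te = X[:split], y[:split], X[split:], y[split:]

# Linear probe: least-squares fit of one weight vector (plus a bias column).
A_tr = np.hstack([X_tr, np.ones((len(X_tr), 1))])
w, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)

A_te = np.hstack([X_te, np.ones((len(X_te), 1))])
acc = (np.sign(A_te @ w) == y_te).mean()
print(f"probe accuracy: {acc:.2f}")
```

If the probe's held-out accuracy is well above chance at some layer, the conflict signal is linearly decodable there; sweeping this over layers is what localizes where detection emerges.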

📝 Abstract
This paper highlights the challenge of decomposing conflict detection from conflict resolution in Vision-Language Models (VLMs) and presents potential approaches, including using a supervised metric via linear probes and group-based attention pattern analysis. We conduct a mechanistic investigation of LLaVA-OV-7B, a state-of-the-art VLM that exhibits diverse resolution behaviors when faced with conflicting multimodal inputs. Our results show that a linearly decodable conflict signal emerges in the model's intermediate layers and that attention patterns associated with conflict detection and resolution diverge at different stages of the network. These findings support the hypothesis that detection and resolution are functionally distinct mechanisms. We discuss how such decomposition enables more actionable interpretability and targeted interventions for improving model robustness in challenging multimodal settings.
Problem

Research questions and friction points this paper is trying to address.

Decomposing conflict detection from resolution in VLMs
Investigating mechanisms in LLaVA-OV-7B for multimodal conflicts
Enabling interpretability and interventions for model robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Linear probes for conflict detection
Group-based attention pattern analysis
Mechanistic investigation of LLaVA-OV-7B
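The group-based attention analysis can be sketched as follows: collect per-layer, per-head attention from the answer position, then group heads by layer and compare how much attention mass lands on image tokens versus text tokens. The attention maps below are random stand-ins (a real analysis would extract them from the model, e.g. via `output_attentions=True` in Hugging Face `transformers`), and the token-span boundaries are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for a VLM's attention: for each layer, head-wise attention
# logits from the final (answer) position over the input sequence.
n_layers, n_heads, seq_len = 8, 4, 32
image_positions = np.arange(0, 16)   # hypothetical image-token span
text_positions = np.arange(16, 32)   # hypothetical text-token span

logits = rng.normal(size=(n_layers, n_heads, seq_len))
attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)  # softmax per head

# Group heads by layer: mean attention mass on image vs. text tokens.
image_mass = attn[..., image_positions].sum(-1).mean(-1)  # shape (n_layers,)
text_mass = attn[..., text_positions].sum(-1).mean(-1)

for layer, (im, tx) in enumerate(zip(image_mass, text_mass)):
    print(f"layer {layer}: image={im:.2f} text={tx:.2f}")
```

Contrasting these per-layer profiles between conflicting and non-conflicting inputs is the kind of comparison that can reveal detection-dominated early layers and resolution-dominated late layers.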
Trang Nguyen
Technical Staff, MIT Lincoln Laboratory
Natural Language Processing · Large Language Models · Explainable AI · Cyber Analytics
Jackson Michaels
Manning College of Information & Computer Sciences, University of Massachusetts Amherst, Amherst, U.S.
Madalina Fiterau
Manning College of Information & Computer Sciences, University of Massachusetts Amherst, Amherst, U.S.
David Jensen
Professor of Computer Science, University of Massachusetts Amherst
Machine Learning · Causation · Causal Discovery · Statistical Relational Learning · Computational Social Science