Protecting multimodal large language models against misleading visualizations

📅 2025-02-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study exposes a significant vulnerability of multimodal large language models (MLLMs) to misleading visualizations, such as charts with truncated or inverted axes, which induce erroneous inferences and can help spread misinformation. To address this, the authors propose six inference-time defense strategies. The most effective decouples chart perception from reasoning: the chart's underlying data table is first extracted, and a text-only LLM then answers the question from that table, so distorted visual encodings no longer influence answer generation. This approach preserves accuracy on non-misleading charts while improving accuracy on misleading ones by 15.4 to 19.6 percentage points, well above the random-baseline level to which undefended MLLMs fall. The work provides both an evaluation of MLLM robustness to misleading visualizations and a practical inference-time defense.

📝 Abstract
We assess the vulnerability of multimodal large language models to misleading visualizations: charts that distort the underlying data using techniques such as truncated or inverted axes, leading readers to draw inaccurate conclusions that may support misinformation or conspiracy theories. Our analysis shows that these distortions severely harm multimodal large language models, reducing their question-answering accuracy to the level of the random baseline. To mitigate this vulnerability, we introduce six inference-time methods to improve the performance of MLLMs on misleading visualizations while preserving their accuracy on non-misleading ones. The most effective approach involves (1) extracting the underlying data table and (2) using a text-only large language model to answer questions based on the table. This method improves performance on misleading visualizations by 15.4 to 19.6 percentage points.
Problem

Research questions and friction points this paper is trying to address.

Assessing vulnerability of MLLMs to misleading visualizations
Reducing accuracy drop in MLLMs due to distorted charts
Introducing inference-time methods to improve MLLM performance on misleading visualizations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts the underlying data table from the visualization
Uses a text-only LLM to answer questions from the extracted table
Improves accuracy on misleading charts by 15.4 to 19.6 percentage points
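The two-step defense above can be sketched as a small pipeline. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the model calls are replaced by simple parsing and lookup so the sketch is self-contained (a real system would use an MLLM or chart-to-table model for step 1 and a text-only LLM for step 2).

```python
def extract_table(chart_text: str) -> dict[str, float]:
    """Step 1 (mocked): recover the chart's underlying data table.

    Here we parse simple 'label: value' lines standing in for the
    output of an MLLM or chart-to-table extractor.
    """
    table = {}
    for line in chart_text.strip().splitlines():
        label, value = line.split(":")
        table[label.strip()] = float(value)
    return table


def answer_from_table(table: dict[str, float], question_label: str) -> float:
    """Step 2 (mocked): answer from the table, not from the rendered chart.

    A text-only LLM would reason over the serialized table; a direct
    lookup keeps the sketch runnable. Because the answer comes from the
    data rather than the (possibly truncated-axis) drawing, misleading
    visual encodings no longer affect it.
    """
    return table[question_label]


# Example: a truncated y-axis may make 2020 look vastly larger than 2019,
# but the extracted values show the difference is small.
chart = "2019: 49.5\n2020: 50.1"
table = extract_table(chart)
print(answer_from_table(table, "2020"))
```

The key design point is the decoupling: once the data table is extracted, the question-answering step never sees the distorted rendering, which is why the method preserves accuracy on non-misleading charts as well.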