Multimodal Backdoor Attack on VLMs for Autonomous Driving via Graffiti and Cross-Lingual Triggers

📅 2026-04-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing backdoor attacks on vision-language models predominantly rely on explicit, unimodal triggers, which struggle to achieve stealth and stability in real-world applications such as autonomous driving. This work proposes GLA, a novel approach that, for the first time, leverages graffiti generated via Stable Diffusion–based image inpainting as natural visual triggers, combined with cross-lingual textual perturbations to form semantically coherent multimodal backdoors. Requiring only a 10% poisoning ratio, GLA achieves a 90% attack success rate with 0% false positive rate, while preserving—and even slightly improving—clean-sample performance on metrics such as BLEU-1. This strategy substantially enhances the attack’s stealthiness and robustness, effectively evading current detection mechanisms without compromising model utility on benign inputs.
📝 Abstract
Visual language model (VLM) is rapidly being integrated into safety-critical systems such as autonomous driving, making it an important attack surface for potential backdoor attacks. Existing backdoor attacks mainly rely on unimodal, explicit, and easily detectable triggers, making it difficult to construct both covert and stable attack channels in autonomous driving scenarios. GLA introduces two naturalistic triggers: graffiti-based visual patterns generated via stable diffusion inpainting, which seamlessly blend into urban scenes, and cross-language text triggers, which introduce distributional shifts while maintaining semantic consistency to build robust language-side trigger signals. Experiments on DriveVLM show that GLA requires only a 10\% poisoning ratio to achieve a 90\% Attack Success Rate (ASR) and a 0\% False Positive Rate (FPR). More insidiously, the backdoor does not weaken the model on clean tasks, but instead improves metrics such as BLEU-1, making it difficult for traditional performance-degradation-based detection methods to identify the attack. This study reveals underestimated security threats in self-driving VLMs and provides a new attack paradigm for backdoor evaluation in safety-critical multimodal systems.
Problem

Research questions and friction points this paper is trying to address.

Multimodal Backdoor Attack
Visual Language Models
Autonomous Driving
Covert Triggers
Safety-Critical Systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal backdoor attack
visual language models
graffiti trigger
cross-lingual trigger
autonomous driving security
🔎 Similar Papers
No similar papers found.
J
Jiancheng Wang
L
Lidan Liang
Y
Yong Wang
Z
Zengzhen Su
Haifeng Xia
Haifeng Xia
Tulane University
Machine Learning
Y
Yuanting Yan
W
Wei Wang