LAMP: Learning Universal Adversarial Perturbations for Multi-Image Tasks via Pre-trained Models

πŸ“… 2026-01-29
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses a key limitation of existing adversarial attack methods: they are designed primarily for single-image inputs under white-box settings and therefore generalize poorly to black-box scenarios involving multi-image multimodal large language models (MLLMs). To bridge this gap, the authors propose LAMP, the first black-box universal adversarial perturbation method tailored to multi-image MLLMs. LAMP introduces a cross-image contagion mechanism and a position-invariant index-attention suppression loss, enabling effective attacks that perturb only a subset of the input images. By constraining the model's attention, the approach improves both the transferability and the imperceptibility of the generated perturbations. Extensive experiments show that LAMP significantly outperforms state-of-the-art methods across multiple vision-language tasks and mainstream MLLMs, achieving substantially higher attack success rates.

πŸ“ Abstract
Multimodal Large Language Models (MLLMs) have achieved remarkable performance across vision-language tasks. Recent advancements allow these models to process multiple images as inputs. However, the vulnerabilities of multi-image MLLMs remain unexplored. Existing adversarial attacks focus on single-image settings and often assume a white-box threat model, which is impractical in many real-world scenarios. This paper introduces LAMP, a black-box method for learning Universal Adversarial Perturbations (UAPs) targeting multi-image MLLMs. LAMP applies an attention-based constraint that prevents the model from effectively aggregating information across images. LAMP also introduces a novel cross-image contagious constraint that forces perturbed tokens to influence clean tokens, spreading adversarial effects without requiring all inputs to be modified. Additionally, an index-attention suppression loss enables a robust position-invariant attack. Experimental results show that LAMP outperforms state-of-the-art baselines, achieving the highest attack success rates across multiple vision-language tasks and models.
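The core recipe behind UAP learning, optimizing one shared perturbation over a whole set of inputs under a norm bound, can be illustrated with a minimal sketch. This is not LAMP's actual algorithm (which works black-box against MLLM attention with contagion and index-suppression losses); the linear `embed` map, the cosine-similarity surrogate loss, the finite-difference gradient, and all hyperparameters below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a frozen image encoder: a fixed random linear map.
# (Illustrative assumption; the real attack targets MLLM vision encoders.)
W = rng.normal(size=(8, 16))

def embed(x):
    return W @ x

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def uap_loss(delta, images):
    # Mean similarity between clean and perturbed embeddings; the attack
    # drives this down so perturbed images stop matching their clean content.
    return float(np.mean([cosine(embed(x), embed(x + delta)) for x in images]))

def learn_uap(images, eps=0.5, steps=50, lr=0.05, h=1e-4):
    # One shared (universal) perturbation, updated by sign-gradient descent
    # and projected onto an L-infinity ball of radius eps after each step.
    delta = rng.uniform(-eps / 2, eps / 2, size=images[0].shape)
    for _ in range(steps):
        grad = np.zeros_like(delta)
        for i in range(delta.size):  # finite-difference gradient (toy scale only)
            e = np.zeros_like(delta)
            e[i] = h
            grad[i] = (uap_loss(delta + e, images) - uap_loss(delta - e, images)) / (2 * h)
        delta = np.clip(delta - lr * np.sign(grad), -eps, eps)
    return delta

images = [rng.normal(size=16) for _ in range(4)]
delta = learn_uap(images)
clean_sim = uap_loss(np.zeros(16), images)     # ~1.0: unperturbed baseline
attacked_sim = uap_loss(delta, images)         # lower: embeddings disrupted
```

The L-infinity projection is what keeps the perturbation small enough to be imperceptible; a real black-box attack would replace the toy encoder and finite differences with query-based gradient estimates or transfer from surrogate models.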
Problem

Research questions and friction points this paper is trying to address.

multi-image MLLMs
adversarial attacks
black-box
Universal Adversarial Perturbations
vision-language tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Universal Adversarial Perturbations
Multi-Image MLLMs
Black-Box Attack
Cross-Image Contagious Constraint
Attention-Based Constraint
πŸ”Ž Similar Papers
No similar papers found.