PA-Attack: Guiding Gray-Box Attacks on LVLM Vision Encoders with Prototypes and Attention

📅 2026-02-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes PA-Attack, an efficient gray-box attack method tailored for large vision-language models (LVLMs) that share a visual encoder, addressing the poor generalization of white-box attacks and the low efficiency of black-box attacks. PA-Attack introduces a novel universal prototype anchoring mechanism to stabilize the attack direction and incorporates a two-stage attention enhancement strategy: first identifying semantically critical visual tokens via token-level attention scoring, then adaptively recalibrating attention weights to precisely concentrate perturbations on sensitivity-prone regions. Extensive experiments demonstrate that PA-Attack achieves an average score reduction rate (SRR) of 75.1% across diverse LVLM architectures and downstream tasks, significantly improving the generalization, efficiency, and applicability of gray-box attacks.
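The two components described above — top-k selection of attention-critical visual tokens, adaptive recalibration of attention weights during the attack, and a prototype-anchored attack direction — can be sketched in plain Python. The linear recalibration schedule and the function names here are illustrative assumptions, not the paper's exact formulation.

```python
import math

def select_critical_tokens(attn, k):
    # Stage 1: rank visual tokens by their attention score and keep the
    # top-k as targets on which perturbations are concentrated.
    idx = sorted(range(len(attn)), key=lambda i: attn[i], reverse=True)[:k]
    mask = [1.0 if i in idx else 0.0 for i in range(len(attn))]
    return idx, mask

def recalibrate_attention(attn, mask, step, total_steps):
    # Stage 2: adaptively shift weight toward the selected tokens as the
    # adversarial process evolves (a linear schedule is assumed here).
    alpha = step / total_steps
    raw = [(1 - alpha) * a + alpha * m * a for a, m in zip(attn, mask)]
    s = sum(raw)
    return [r / s for r in raw]

def prototype_anchored_direction(feat, prototype):
    # Unit vector from the current image feature toward a general,
    # dissimilar prototype: a stable per-step attack direction.
    d = [p - f for f, p in zip(feat, prototype)]
    n = math.sqrt(sum(x * x for x in d)) or 1.0
    return [x / n for x in d]

# Toy run: 6 visual tokens, keep the 2 most attended.
attn = [0.05, 0.30, 0.10, 0.25, 0.20, 0.10]
idx, mask = select_critical_tokens(attn, k=2)
recal = recalibrate_attention(attn, mask, step=5, total_steps=10)
direction = prototype_anchored_direction([1.0, 0.0], [0.0, 1.0])
```

In a real attack loop these pieces would operate on encoder features and attention maps from the shared vision encoder, with the masked, attention-weighted gradient of the prototype-anchored loss driving each perturbation step.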

📝 Abstract
Large Vision-Language Models (LVLMs) are foundational to modern multimodal applications, yet their susceptibility to adversarial attacks remains a critical concern. Prior white-box attacks rarely generalize across tasks, and black-box methods depend on expensive transfer, which limits efficiency. The vision encoder, standardized and often shared across LVLMs, provides a stable gray-box pivot with strong cross-model transfer. Building on this premise, we introduce PA-Attack (Prototype-Anchored Attentive Attack). PA-Attack begins with prototype-anchored guidance that provides a stable attack direction towards a general and dissimilar prototype, tackling the attribute-restricted issue and limited task generalization of vanilla attacks. Building on this, we propose a two-stage attention enhancement mechanism: (i) leverage token-level attention scores to concentrate perturbations on critical visual tokens, and (ii) adaptively recalibrate attention weights to track the evolving attention during the adversarial process. Extensive experiments across diverse downstream tasks and LVLM architectures show that PA-Attack achieves an average 75.1% score reduction rate (SRR), demonstrating strong attack effectiveness, efficiency, and task generalization in LVLMs. Code is available at https://github.com/hefeimei06/PA-Attack.

Problem

Research questions and friction points this paper is trying to address.

adversarial attacks
Large Vision-Language Models
task generalization
gray-box attacks
vision encoder
Innovation

Methods, ideas, or system contributions that make the work stand out.

prototype-anchored attack
attention mechanism
gray-box attack
vision encoder
adversarial transfer