Fighting Fire with Fire (F3): A Training-free and Efficient Visual Adversarial Example Purification Method in LVLMs

📅 2025-06-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large Vision-Language Models (LVLMs) are vulnerable to visual adversarial attacks, yet existing defense methods are scarce and typically require model retraining. This paper proposes F3, a training-free, zero-parameter-update purification method that efficiently neutralizes adversarial perturbations by injecting controllable noise and leveraging cross-modal attention for self-correction—introducing the novel “fight-fire-with-fire” paradigm. F3 comprises three stages: (1) stochastic noise injection to generate reference attention maps, (2) cross-modal attention distillation to extract robust multimodal alignments, and (3) noise-driven attention recalibration to suppress attack-induced distortions. Evaluated across multiple LVLM architectures and mainstream adversarial benchmarks, F3 consistently outperforms prior purification methods in robustness and fidelity, achieves over 3× faster inference, and strictly preserves original task performance without degradation.

📝 Abstract
Recent advances in large vision-language models (LVLMs) have showcased their remarkable capabilities across a wide range of multimodal vision-language tasks. However, these models remain vulnerable to visual adversarial attacks, which can substantially compromise their performance. Despite their potential impact, the development of effective methods for purifying such adversarial examples has received relatively limited attention. In this paper, we introduce F3, a novel adversarial purification framework that employs a counterintuitive "fighting fire with fire" strategy: intentionally introducing simple perturbations to adversarial examples to mitigate their harmful effects. Specifically, F3 leverages cross-modal attentions derived from randomly perturbed adversarial examples as reference targets. By injecting noise into these adversarial examples, F3 effectively refines their attention, resulting in cleaner and more reliable model outputs. Remarkably, this seemingly paradoxical approach of employing noise to counteract adversarial attacks yields impressive purification results. Furthermore, F3 offers several distinct advantages: it is training-free, straightforward to implement, and significantly more computationally efficient than existing purification methods. These attributes render F3 particularly suitable for large-scale industrial applications where both robust performance and operational efficiency are critical priorities. The code will be made publicly available.
Problem

Research questions and friction points this paper is trying to address.

Purifying visual adversarial examples in LVLMs
Mitigating harmful effects using intentional perturbations
Training-free and efficient adversarial purification method
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses noise injection to purify adversarial examples
Leverages cross-modal attentions from perturbed examples
Training-free and computationally efficient method
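The three-stage pipeline described in the summary can be illustrated with a minimal toy sketch. This is not the paper's implementation: the `attention_map` function below is a hypothetical stand-in for an LVLM's cross-modal attention, and the stage names and parameters (`num_noisy`, `sigma`, `step`) are illustrative assumptions.

```python
import numpy as np

def attention_map(image):
    # Hypothetical stand-in for an LVLM's cross-modal attention:
    # a softmax over per-patch channel means. The real F3 reads
    # attention from the model's vision-language layers.
    logits = image.mean(axis=-1)
    e = np.exp(logits - logits.max())
    return e / e.sum()

def f3_purify(adv_image, num_noisy=8, sigma=0.1, step=0.5, seed=0):
    """Toy sketch of F3's three stages (assumed structure):
    1. stochastic noise injection -> reference attention maps,
    2. aggregation ("distillation") into a robust reference,
    3. recalibration of the adversarial attention toward it.
    """
    rng = np.random.default_rng(seed)
    # Stage 1: attention maps from randomly perturbed copies
    refs = [attention_map(adv_image + rng.normal(0.0, sigma, adv_image.shape))
            for _ in range(num_noisy)]
    # Stage 2: distill a robust reference by averaging
    ref_attn = np.mean(refs, axis=0)
    # Stage 3: pull the adversarial attention toward the reference
    adv_attn = attention_map(adv_image)
    purified = (1 - step) * adv_attn + step * ref_attn
    return purified / purified.sum()
```

Because the purification operates only on attention at inference time, no model parameters are updated, which matches the training-free, zero-parameter-update claim above.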