Rethinking Target Label Conditioning in Adversarial Attacks: A 2D Tensor-Guided Generative Approach

📅 2025-04-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing generative methods for multi-target adversarial attacks study target-label conditioning mainly in theory and lack empirical validation, and they struggle to achieve high transferability and interpretability at the same time. This work identifies two factors that drive transferability: the *quality* (structural integrity) and the *quantity* (spatial sufficiency) of the implanted semantic features. To address this, we propose the 2D Tensor-Guided Adversarial Fusion framework (2D-TGAF), built upon diffusion models; it is the first method to enable interpretable encoding of target labels into two-dimensional semantic tensors. We further introduce a semantic-preserving masking strategy to enhance training stability. Evaluated on ImageNet, our approach consistently outperforms state-of-the-art methods in attack success rate, both on standard models and against diverse defenses, including randomized smoothing and feature denoising. This work establishes a novel paradigm for multi-target adversarial attacks that jointly ensures strong transferability and human-interpretable semantics.

📝 Abstract

Compared to single-target adversarial attacks, multi-target attacks have garnered significant attention due to their ability to generate adversarial images for multiple target classes simultaneously. Existing generative approaches for multi-target attacks mainly analyze how target labels affect noise generation from a theoretical perspective, and lack practical validation and a comprehensive summary. To address this gap, we first identify and validate that semantic feature quality and quantity are critical factors affecting the transferability of targeted attacks: 1) Feature quality refers to the structural and detailed completeness of the implanted target features; deficiencies may result in the loss of key discriminative information. 2) Feature quantity refers to the spatial sufficiency of the implanted target features; inadequacy limits the victim model's attention to these features. Based on these findings, we propose the 2D Tensor-Guided Adversarial Fusion (2D-TGAF) framework, which leverages the powerful generative capabilities of diffusion models to encode target labels into two-dimensional semantic tensors for guiding adversarial noise generation. Additionally, we design a novel masking strategy tailored for the training process, ensuring that parts of the generated noise retain complete semantic information about the target class. Extensive experiments on the standard ImageNet dataset demonstrate that 2D-TGAF consistently surpasses state-of-the-art methods in attack success rates, both on normally trained models and across various defense mechanisms.
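
As a point of reference for the "attack success rate" mentioned in the abstract, the usual metric for targeted attacks is the fraction of adversarial images that the victim model assigns to the intended target class. The sketch below illustrates that common definition only. The function name `targeted_success_rate` and the sample labels are hypothetical, and the paper's exact evaluation protocol is not specified in this summary.

```python
from typing import Sequence


def targeted_success_rate(predictions: Sequence[int], targets: Sequence[int]) -> float:
    """Fraction of adversarial inputs classified as their intended target class.

    `predictions` holds the victim model's predicted class per adversarial image,
    `targets` holds the class the attacker wanted for that image.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if len(predictions) == 0:
        raise ValueError("need at least one sample")
    hits = sum(1 for p, t in zip(predictions, targets) if p == t)
    return hits / len(predictions)


# Hypothetical labels: three of the four adversarial images hit their target.
demo_predictions = [3, 7, 7, 1]
demo_targets = [3, 7, 2, 1]
demo_rate = targeted_success_rate(demo_predictions, demo_targets)
```

A transfer evaluation would compute this rate on a victim model that was not used to craft the perturbations.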

Problem

Research questions and friction points this paper is trying to address.

Improving multi-target adversarial attack transferability via semantic feature quality and quantity

Enhancing adversarial noise generation using 2D semantic tensor guidance

Addressing limitations in existing generative multi-target attack approaches

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses the 2D Tensor-Guided Adversarial Fusion (2D-TGAF) framework

Leverages diffusion models for semantic encoding

Implements a novel masking strategy for training
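
To make the masking idea more concrete, here is a minimal sketch of one generic way to mask adversarial noise during training: keep the noise only inside a random rectangular patch, so the surviving region has to carry the target semantics by itself. This is an assumption for illustration, not the paper's actual strategy. The summary does not describe the mask's shape or schedule, and the names `random_patch_mask`, `apply_masked_noise`, and `epsilon` are hypothetical.

```python
import random
from typing import List

Grid = List[List[float]]


def random_patch_mask(height: int, width: int, patch_h: int, patch_w: int,
                      rng: random.Random) -> List[List[int]]:
    """Binary mask that is 1 inside one randomly placed patch and 0 elsewhere."""
    top = rng.randint(0, height - patch_h)    # randint is inclusive on both ends
    left = rng.randint(0, width - patch_w)
    return [
        [1 if top <= r < top + patch_h and left <= c < left + patch_w else 0
         for c in range(width)]
        for r in range(height)
    ]


def apply_masked_noise(image: Grid, noise: Grid, mask: List[List[int]],
                       epsilon: float) -> Grid:
    """Add noise clipped to [-epsilon, epsilon] where mask is 1; keep pixels in [0, 1]."""
    result = []
    for img_row, noise_row, mask_row in zip(image, noise, mask):
        row = []
        for pixel, n, m in zip(img_row, noise_row, mask_row):
            delta = max(-epsilon, min(epsilon, n)) * m
            row.append(max(0.0, min(1.0, pixel + delta)))
        result.append(row)
    return result


# Hypothetical 4x4 grayscale image with a 2x2 visible noise patch.
demo_rng = random.Random(0)
demo_mask = random_patch_mask(4, 4, 2, 2, demo_rng)
demo_image = [[0.5] * 4 for _ in range(4)]
demo_noise = [[1.0] * 4 for _ in range(4)]
demo_adv = apply_masked_noise(demo_image, demo_noise, demo_mask, epsilon=0.1)
```

In a real training loop the noise would come from the generator and the masked image would be passed to a surrogate classifier; both are omitted here.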

🔎 Similar Papers

Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks