Latent Danger Zone: Distilling Unified Attention for Cross-Architecture Black-box Attacks

📅 2025-09-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Poor cross-architecture transferability and high query cost hinder practical black-box adversarial attacks. To address these challenges, this paper proposes Joint Attention Distillation (JAD), a novel black-box attack framework leveraging Latent Diffusion Models (LDMs). JAD is the first to integrate LDMs into black-box adversarial generation; it jointly distills attention maps from both CNNs and Vision Transformers (ViTs) to identify architecture-agnostic sensitive regions, thereby guiding efficient synthesis of highly transferable adversarial examples. Crucially, JAD eliminates reliance on model-specific architectural assumptions, significantly reducing query overhead. Extensive experiments demonstrate that JAD achieves state-of-the-art cross-architecture transferability and generation efficiency against diverse target models—including ResNet, ViT, and DeiT—reducing average query cost by 37.2% and improving transfer success rates by 12.8–24.5%. This work establishes a new paradigm for efficient, general-purpose black-box adversarial attacks.

📝 Abstract
Black-box adversarial attacks remain challenging due to limited access to model internals. Existing methods often depend on specific network architectures or require numerous queries, resulting in limited cross-architecture transferability and high query costs. To address these limitations, we propose JAD, a latent diffusion model framework for black-box adversarial attacks. JAD generates adversarial examples by leveraging a latent diffusion model guided by attention maps distilled from both convolutional neural network (CNN) and Vision Transformer (ViT) models. By focusing on image regions that are commonly sensitive across architectures, this approach crafts adversarial perturbations that transfer effectively between different model types. This joint attention distillation strategy makes JAD architecture-agnostic, achieving superior attack generalization across diverse models. Moreover, the generative nature of the diffusion framework yields high adversarial-sample generation efficiency by reducing reliance on iterative queries. Experiments demonstrate that JAD offers improved attack generalization, generation efficiency, and cross-architecture transferability compared to existing methods, providing a promising and effective paradigm for black-box adversarial attacks.
Problem

Research questions and friction points this paper is trying to address.

Limited cross-architecture transferability in black-box adversarial attacks
High query costs due to iterative optimization requirements
Dependence on specific network architectures reduces attack generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages latent diffusion model with attention maps
Distills unified attention from CNN and ViT
Generates adversarial examples with reduced queries
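The joint attention distillation idea above can be sketched as follows. This is a minimal illustration under assumed simplifications, not the authors' implementation: it fuses two pre-extracted saliency maps (e.g., Grad-CAM from a CNN and attention rollout from a ViT) into a unified map and uses it to concentrate a perturbation on architecture-agnostic sensitive regions. The function names `joint_attention` and `masked_perturbation`, the `alpha` mixing weight, and the L∞ budget are hypothetical choices.

```python
import numpy as np

def normalize(att):
    """Min-max normalize an attention map to [0, 1]."""
    att = att.astype(np.float64)
    rng = att.max() - att.min()
    return (att - att.min()) / rng if rng > 0 else np.zeros_like(att)

def joint_attention(cnn_att, vit_att, alpha=0.5):
    """Fuse CNN and ViT attention maps into one unified map.
    The elementwise minimum rewards regions salient to BOTH
    architectures; the average term keeps some weight on
    regions salient to only one of them."""
    a, b = normalize(cnn_att), normalize(vit_att)
    return alpha * np.minimum(a, b) + (1 - alpha) * 0.5 * (a + b)

def masked_perturbation(delta, unified_att, budget=8 / 255):
    """Weight a raw perturbation by the unified attention map,
    then clip it to an L-infinity budget."""
    return np.clip(delta * unified_att[..., None], -budget, budget)

# Toy usage with random stand-ins for the two attention maps.
rng = np.random.default_rng(0)
cnn_att = rng.random((16, 16))
vit_att = rng.random((16, 16))
unified = joint_attention(cnn_att, vit_att)
delta = rng.standard_normal((16, 16, 3)) * 0.1
perturbation = masked_perturbation(delta, unified)
```

In the full JAD pipeline as described, a map of this kind would guide the latent diffusion model's generation step rather than being applied as a pixel-space mask, which is what removes the need for iterative target-model queries.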
Yang Li
School of Automation at Northwestern Polytechnical University, Xi’an, 710129, China
Chenyu Wang
School of Automation at Northwestern Polytechnical University, Xi’an, 710129, China
Tingrui Wang
School of Automation at Northwestern Polytechnical University, Xi’an, 710129, China
Yongwei Wang
Zhejiang University
Research areas: AI4Media · Multimedia Forensics · Trust Media
Haonan Li
School of Automation at Northwestern Polytechnical University, Xi’an, 710129, China
Zhunga Liu
School of Automation at Northwestern Polytechnical University, Xi’an, 710129, China
Quan Pan
School of Automation at Northwestern Polytechnical University, Xi’an, 710129, China