Data-Free Universal Attack by Exploiting the Intrinsic Vulnerability of Deep Models

📅 2025-03-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address a practical bottleneck of existing universal adversarial perturbation (UAP) generation methods, namely their heavy reliance on large-scale training data, this paper proposes IntriUAP, a fully data-free UAP generation method. Methodologically, IntriUAP traces the intrinsic vulnerability of deep neural networks (DNNs) to the ill-conditioned nature (low-rankness and high condition number) of the linear components in models composed of linear layers and 1-Lipschitz nonlinear layers. Leveraging this insight, it builds a UAP generation paradigm that uses only the model structure: by performing singular value decomposition (SVD) on selected linear layers and aligning the perturbation with the dominant right singular vectors, it deliberately excites the most ill-conditioned directions. Evaluated on mainstream image classification models, IntriUAP achieves attack success rates competitive with state-of-the-art data-free methods. Notably, even when operating on only 50% of the linear layers, its attack success rate drops by merely 4%, demonstrating strong robustness and practicality.
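The core mechanism described above can be sketched in a few lines of NumPy. This is a hedged illustration, not the authors' implementation: it takes a stand-in weight matrix `W` for one linear layer, extracts the right singular vector of the largest singular value via SVD, and projects that direction onto a typical L-infinity perturbation budget (`eps` is an assumed value, not from the paper).

```python
import numpy as np

# Illustrative sketch of the idea behind IntriUAP (not the authors' code):
# excite the most ill-conditioned direction of a single linear layer by
# aligning a perturbation with the top right singular vector of its weights.

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 3072))  # stand-in weight matrix (e.g. a flattened fc layer)

# SVD: W = U @ diag(S) @ Vt; the rows of Vt are the right singular vectors.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
v1 = Vt[0]  # unit direction most amplified by W: ||W @ v1|| equals S[0]

eps = 10 / 255                 # assumed L_inf budget, common in UAP attacks
delta = eps * np.sign(v1)      # project the direction onto the L_inf ball

# Compare amplification of the aligned direction vs. a random unit direction:
gain_aligned = np.linalg.norm(W @ v1)   # equals the top singular value S[0]
r = rng.standard_normal(W.shape[1])
r /= np.linalg.norm(r)
gain_random = np.linalg.norm(W @ r)     # much smaller on average
```

For a full model, the paper aggregates this alignment over the dominant right singular vectors of multiple selected linear layers; the single-layer case above only shows why the ill-conditioned direction is the one worth exciting.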

📝 Abstract
Deep neural networks (DNNs) are susceptible to Universal Adversarial Perturbations (UAPs), which are instance-agnostic perturbations that can deceive a target model across a wide range of samples. Unlike instance-specific adversarial examples, UAPs present a greater challenge as they must generalize across different samples and models. Generating UAPs typically requires access to numerous examples, which is a strong assumption in real-world tasks. In this paper, we propose a novel data-free method called Intrinsic UAP (IntriUAP), by exploiting the intrinsic vulnerabilities of deep models. We analyze a series of popular deep models composed of linear layers and nonlinear layers with a Lipschitz constant of 1, revealing that the vulnerability of these models is predominantly influenced by their linear components. Based on this observation, we leverage the ill-conditioned nature of the linear components by aligning the UAP with the right singular vectors corresponding to the maximum singular value of each linear layer. Remarkably, our method achieves highly competitive performance in attacking popular image classification models without using any image samples. We also evaluate the black-box attack performance of our method, showing that it matches the state-of-the-art baseline for data-free methods on models that conform to our theoretical framework. Beyond the data-free assumption, IntriUAP also operates under a weaker assumption, where the adversary can access only a few of the victim model's layers. Experiments demonstrate that the attack success rate decreases by only 4% when the adversary has access to just 50% of the linear layers in the victim model.
Problem

Research questions and friction points this paper is trying to address.

Existing UAP generation methods require access to numerous samples, a strong assumption in real-world tasks
It is unclear which intrinsic properties of deep models make them vulnerable to universal perturbations
Settings where the adversary can access only some of the victim model's layers are underexplored
Innovation

Methods, ideas, or system contributions that make the work stand out.

Traces model vulnerability to the ill-conditioned nature of linear components
Aligns the UAP with the right singular vectors of each linear layer's maximum singular value
Operates without any image samples, even with access to only part of the model's layers
YangTian Yan
Faculty of Innovation Engineering, Macau University of Science and Technology
Jinyu Tian
Macau University of Science and Technology
Adversarial Machine Learning