🤖 AI Summary
Existing transferable adversarial attacks often get trapped in "deceptively flat regions" (regions that appear flat but are actually sharp), leading to degraded cross-model transferability. To address this, we propose an adversarial flatness optimization framework based on second-order information. First, we formally define Adversarial Flatness (AF) and establish theoretical guarantees for its efficacy. Second, we design the Adversarial Flatness-Aware (AFA) attack, which mitigates gradient-sign misalignment via a dual-step gradient approximation and gradient alignment. Third, we introduce Monte Carlo Adversarial Sampling (MCAS) to improve inner-loop sampling efficiency. By integrating AF regularization with efficient sampling, our method significantly outperforms six baseline attacks on the ImageNet-compatible dataset. It demonstrates superior transferability across diverse model architectures and in real-world API-based black-box scenarios, while achieving measurably flatter loss landscapes.
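The dual-step idea above (probe the loss at a nearby ascent point, then align the two gradients before taking a signed step) can be sketched on a toy loss. This is a minimal illustrative sketch, not the paper's exact algorithm: the function names, the blending weight, and the quadratic surrogate loss are all hypothetical.

```python
import numpy as np

def toy_loss_grad(x):
    # Toy surrogate loss: an anisotropic quadratic standing in for a model's
    # loss, so sharp and flat directions differ (H's eigenvalues 1 vs. 10).
    H = np.diag([1.0, 10.0])
    return 0.5 * x @ H @ x, H @ x

def flatness_aware_step(x, alpha=0.05, rho=0.1):
    """One hypothetical flatness-aware attack step (illustrative only)."""
    _, g = toy_loss_grad(x)
    # Dual step: probe the loss at a nearby point along the ascent direction
    # to sense local sharpness.
    x_probe = x + rho * g / (np.linalg.norm(g) + 1e-12)
    _, g_probe = toy_loss_grad(x_probe)
    # Gradient alignment: blend the base and probe gradients so the sign of
    # the update is less sensitive to local curvature (the sign-misalignment
    # issue the summary mentions).
    g_aligned = 0.5 * (g + g_probe)
    # Signed ascent step, as in FGSM-style attacks.
    return x + alpha * np.sign(g_aligned)

x = np.array([0.3, -0.2])
for _ in range(3):
    x = flatness_aware_step(x)
```

In a real attack, `toy_loss_grad` would be replaced by backpropagation through the surrogate model, and the update would be projected back into the allowed perturbation ball.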
📝 Abstract
Transferable attacks generate adversarial examples on surrogate models to fool unknown victim models, posing real-world threats and attracting growing research interest. Although recent studies seek flat loss regions to obtain transferable adversarial examples, they still fall into suboptimal regions, especially flat-yet-sharp areas, which we term deceptive flatness. In this paper, we introduce a novel black-box gradient-based transferable attack from the perspective of dual-order information. Specifically, we propose Adversarial Flatness (AF) as a solution to the deceptive flatness problem and provide a theoretical guarantee for adversarial transferability. Based on this, using an efficient approximation of our objective, we instantiate our attack as the Adversarial Flatness Attack (AFA), addressing the altered-gradient-sign issue. Additionally, to further improve attack ability, we devise Monte Carlo Adversarial Sampling (MCAS), which enhances inner-loop sampling efficiency. Comprehensive results on the ImageNet-compatible dataset demonstrate superiority over six baselines, generating adversarial examples in flatter regions and boosting transferability across model architectures. When combined with input-transformation attacks or evaluated against the Baidu Cloud API, our method also outperforms the baselines.