🤖 AI Summary
To address the challenge of simultaneously achieving high transferability and visual imperceptibility in adversarial attacks against black-box deepfake detectors, this paper proposes MS-GAGA, a dual-stream, metric-aware adversarial attack framework. Methodologically, it introduces two complementary perturbation-generation modules, MNTD-PGD and SG-PGD, which model global robust perturbations and local salient-region perturbations, respectively. A metric-aware candidate selection mechanism, grounded in SSIM-based perceptual fidelity and cross-model attack success rate, is integrated to jointly optimize transferability and visual quality. The key innovation lies in the first coupling of dual-stream gradient search with perception-driven candidate filtering, explicitly balancing attack efficacy and visual fidelity. Evaluations on multiple unseen black-box detectors demonstrate that the method achieves up to 27% higher misclassification rates, significantly outperforming existing state-of-the-art attacks.
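The two perturbation streams share the same projected-gradient backbone; they differ mainly in where the update is applied. The paper does not give the exact MNTD-PGD gradient enhancement, so the sketch below shows only the common core: one L-infinity PGD step, with an optional 0/1 saliency mask standing in for the SG-PGD stream (all names, e.g. `pgd_step`, are hypothetical; the gradient is assumed to come from a surrogate model).

```python
import numpy as np

def pgd_step(x, x0, grad, alpha=0.01, eps=0.03, mask=None):
    """One projected-gradient ascent step on the attack loss.

    x    : current adversarial image (float array in [0, 1])
    x0   : original clean image
    grad : gradient of the attack loss w.r.t. x (from a surrogate model)
    mask : optional 0/1 saliency mask; restricts the update to salient
           regions (the SG-PGD stream). None means a global perturbation.
    """
    step = alpha * np.sign(grad)
    if mask is not None:
        step = step * mask  # perturb only the salient pixels
    x_new = x + step
    # project back into the L-infinity ball of radius eps around x0
    x_new = np.clip(x_new, x0 - eps, x0 + eps)
    # keep pixel values in the valid range
    return np.clip(x_new, 0.0, 1.0)
```

Iterating this step grows the perturbation by `alpha` per iteration until the `eps` budget clips it, which is why small-budget attacks like MNTD-PGD focus on how the gradient itself is computed.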
📝 Abstract
We present MS-GAGA (Metric-Selective Guided Adversarial Generation Attack), a two-stage framework for crafting transferable and visually imperceptible adversarial examples against deepfake detectors in black-box settings. In Stage 1, a dual-stream attack module generates adversarial candidates: MNTD-PGD applies enhanced gradient calculations optimized for small perturbation budgets, while SG-PGD focuses perturbations on visually salient regions. This complementary design expands the adversarial search space and improves transferability across unseen models. In Stage 2, a metric-aware selection module evaluates candidates based on both their success against black-box models and their structural similarity (SSIM) to the original image. By jointly optimizing transferability and imperceptibility, MS-GAGA achieves up to 27% higher misclassification rates on unseen detectors compared to state-of-the-art attacks.
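The Stage 2 selection rule can be sketched as a simple filter-then-rank procedure: discard candidates whose SSIM to the clean image falls below a fidelity floor, then prefer the candidate that fools the most surrogate detectors, breaking ties by SSIM. The sketch below uses a simplified single-window SSIM and assumes detectors are callables returning a label; the function names, the `ssim_min` threshold, and the tie-breaking order are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def ssim_global(a, b, c1=0.01**2, c2=0.03**2):
    """Simplified single-window SSIM over the whole image (values in [0, 1])."""
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a**2 + mu_b**2 + c1) * (va + vb + c2))

def select_candidate(x0, candidates, detectors, ssim_min=0.95):
    """Keep candidates with SSIM >= ssim_min; among those, return the one
    that flips the prediction of the most detectors, ties broken by SSIM."""
    best, best_key = None, (-1, -1.0)
    for x in candidates:
        s = ssim_global(x0, x)
        if s < ssim_min:
            continue  # too visible: fails the imperceptibility constraint
        fooled = sum(1 for d in detectors if d(x) != d(x0))
        key = (fooled, s)
        if key > best_key:
            best, best_key = x, key
    return best
```

Ranking by transferable success first and fidelity second mirrors the joint objective: a candidate only competes on SSIM once it clears the misclassification criterion.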