🤖 AI Summary
Image forgery localization models are highly vulnerable to adversarial perturbations: imperceptible noise added to forged images can severely degrade localization accuracy. This work introduces the first adversarial defense framework for forgery localization, built around an Adversarial Noise Suppression Module (ANSM) that generates a defensive perturbation to counteract adversarial noise. ANSM is trained in two stages: Forgery-relevant Features Alignment (FFA), which minimizes the channel-wise KL divergence between forgery-relevant features of adversarial and original forged images, and Mask-guided Refinement (MgR), which applies a dual-mask constraint so the perturbation remains effective for both adversarial and original forged images. Experiments across diverse attack algorithms, including FGSM, PGD, and CW, show that the approach restores localization performance on adversarial images to near its original level while leaving performance on unperturbed forged images nearly unaffected. The source code and an anti-forensics benchmark dataset are publicly released.
📝 Abstract
Recent advances in deep learning have significantly propelled the development of image forgery localization. However, existing models remain highly vulnerable to adversarial attacks: imperceptible noise added to forged images can severely mislead these models. In this paper, we address this challenge with an Adversarial Noise Suppression Module (ANSM) that generates a defensive perturbation to suppress the attack effect of adversarial noise. We observe that forgery-relevant features extracted from adversarial and original forged images exhibit distinct distributions. To bridge this gap, we introduce Forgery-relevant Features Alignment (FFA) as a first-stage training strategy, which reduces distributional discrepancies by minimizing the channel-wise Kullback-Leibler divergence between these features. To further refine the defensive perturbation, we design a second-stage training strategy, termed Mask-guided Refinement (MgR), which incorporates a dual-mask constraint. MgR ensures that the perturbation remains effective for both adversarial and original forged images, recovering forgery localization accuracy to its original level. Extensive experiments across various attack algorithms demonstrate that our method significantly restores the forgery localization model's performance on adversarial images. Notably, when ANSM is applied to original forged images, localization performance remains nearly unaffected. To the best of our knowledge, this is the first work on adversarial defense for image forgery localization. We have released the source code and anti-forensics dataset.
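The abstract does not spell out the exact loss formulations, so the sketch below shows one plausible PyTorch reading of the two training stages: a channel-wise KL alignment term for FFA (treating each feature channel as a distribution over spatial positions) and a dual-mask constraint for MgR (supervising the masks predicted for both the defended adversarial image and the defended original forged image). All function and tensor names here are hypothetical, not the authors' API.

```python
# Minimal sketch of the two losses described in the abstract (assumptions, not the paper's code).
import torch
import torch.nn.functional as F

def channelwise_kl_alignment(feat_adv: torch.Tensor, feat_clean: torch.Tensor) -> torch.Tensor:
    """Stage 1 (FFA): align forgery-relevant features of the defended adversarial image
    with those of the original forged image via channel-wise KL divergence.
    Both inputs have shape (B, C, H, W); each channel map is normalized into a
    distribution over spatial positions (one possible reading of "channel-wise")."""
    b, c, _, _ = feat_adv.shape
    log_p = F.log_softmax(feat_adv.reshape(b, c, -1), dim=-1)   # defended (adversarial) features
    q = F.softmax(feat_clean.reshape(b, c, -1), dim=-1)         # target (original) features
    return F.kl_div(log_p, q, reduction="batchmean")            # KL averaged over the batch

def dual_mask_loss(pred_adv: torch.Tensor, pred_clean: torch.Tensor, gt_mask: torch.Tensor) -> torch.Tensor:
    """Stage 2 (MgR): a dual-mask constraint keeps the localization masks predicted for
    the defended adversarial image AND the defended original forged image close to the
    ground-truth forgery mask, so the perturbation helps the former without hurting the latter."""
    loss_adv = F.binary_cross_entropy_with_logits(pred_adv, gt_mask)
    loss_clean = F.binary_cross_entropy_with_logits(pred_clean, gt_mask)
    return loss_adv + loss_clean
```

In this reading, stage one only matches feature distributions, while stage two directly supervises the localization output; the second term on original forged images is what keeps clean-image performance nearly unaffected.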