ForensicsSAM: Toward Robust and Unified Image Forgery Detection and Localization Resisting to Adversarial Attack

📅 2025-08-10
🤖 AI Summary
Existing parameter-efficient fine-tuning (PEFT)-based image forgery detection and localization (IFDL) methods lack adversarial robustness: adversarial examples crafted from the upstream model alone transfer to them. To address this, we propose ForensicsSAM, the first robust IFDL framework tailored for the PEFT paradigm, featuring three key innovations: (1) a forgery expert module that enhances the encoder's representation of forgery artifacts; (2) a lightweight RGB-domain adversary detector that identifies attacked inputs; and (3) an adaptively activated adversary expert network that corrects feature shifts caused by adversarial noise. ForensicsSAM is the first PEFT-based method to provide a defense against cross-model transferable adversarial attacks while preserving clean-sample performance. Built upon the Segment Anything Model (SAM), it achieves state-of-the-art accuracy in both image-level detection and pixel-level localization across multiple benchmarks, and significantly improves robustness against both white-box and black-box adversarial attacks.
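The threat model in the summary, an attacker who touches only the upstream model, can be illustrated with a small sketch. This is not the attack used in the paper, whose details are not given here. It assumes a generic feature-distortion attack: a PGD-style loop that pushes the frozen encoder's features away from their clean values under an L-infinity budget. The encoder `encode` (a toy `tanh(W @ x)` map), the function names, and the values of `eps`, `alpha`, and `steps` are all hypothetical.

```python
import math
import random

def encode(W, x):
    # Toy stand-in for a frozen upstream image encoder: tanh(W @ x).
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in W]

def feature_shift(W, x, x_adv):
    # Squared L2 distance between perturbed and clean features.
    return sum((a - b) ** 2 for a, b in zip(encode(W, x_adv), encode(W, x)))

def upstream_pgd(W, x, eps=0.03, alpha=0.01, steps=10, seed=0):
    """Maximise the feature shift using only the upstream encoder.

    No downstream model or training data is consulted. Pixel-range
    clipping is omitted to keep the sketch short.
    """
    rng = random.Random(seed)
    clean = encode(W, x)
    # Random start inside the eps-ball (the gradient is zero at delta = 0).
    delta = [rng.uniform(-eps, eps) for _ in x]
    best = list(delta)
    best_shift = feature_shift(W, x, [a + d for a, d in zip(x, delta)])
    for _ in range(steps):
        feat = encode(W, [a + d for a, d in zip(x, delta)])
        # Analytic gradient of the shift with respect to each input value.
        grad = [
            sum(2 * (f - c) * (1 - f * f) * row[i]
                for f, c, row in zip(feat, clean, W))
            for i in range(len(x))
        ]
        # Signed ascent step, then projection back into the eps-ball.
        delta = [
            max(-eps, min(eps, d + alpha * ((g > 0) - (g < 0))))
            for d, g in zip(delta, grad)
        ]
        shift = feature_shift(W, x, [a + d for a, d in zip(x, delta)])
        if shift > best_shift:
            best, best_shift = list(delta), shift
    return [a + d for a, d in zip(x, best)], best_shift

# Hypothetical 4x8 encoder weights and an 8-value "image".
rng = random.Random(1)
W = [[rng.gauss(0, 0.5) for _ in range(8)] for _ in range(4)]
x = [rng.uniform(0.2, 0.8) for _ in range(8)]
x_adv, shift = upstream_pgd(W, x)
```

Because the loop keeps the best perturbation seen so far, the returned shift is never smaller than that of the random starting point. Whether such a perturbation also degrades a downstream IFDL model is the transferability claim the paper makes; the sketch does not demonstrate it.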

📝 Abstract
Parameter-efficient fine-tuning (PEFT) has emerged as a popular strategy for adapting large vision foundation models, such as the Segment Anything Model (SAM) and LLaVA, to downstream tasks like image forgery detection and localization (IFDL). However, existing PEFT-based approaches overlook their vulnerability to adversarial attacks. In this paper, we show that highly transferable adversarial images can be crafted solely via the upstream model, without accessing the downstream model or training data, significantly degrading IFDL performance. To address this, we propose ForensicsSAM, a unified IFDL framework with built-in adversarial robustness. Our design is guided by three key ideas: (1) To compensate for the lack of forgery-relevant knowledge in the frozen image encoder, we inject forgery experts into each transformer block to enhance its ability to capture forgery artifacts. These forgery experts are always activated and shared across all input images. (2) To detect adversarial images, we design a lightweight adversary detector that learns to capture structured, task-specific artifacts in the RGB domain, enabling reliable discrimination across various attack methods. (3) To resist adversarial attacks, we inject adversary experts into the global attention layers and MLP modules to progressively correct feature shifts induced by adversarial noise. These adversary experts are adaptively activated by the adversary detector, thereby avoiding unnecessary interference with clean images. Extensive experiments across multiple benchmarks demonstrate that ForensicsSAM achieves superior resistance to various adversarial attack methods, while also delivering state-of-the-art performance in image-level forgery detection and pixel-level forgery localization. The resource is available at https://github.com/siriusPRX/ForensicsSAM.
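The routing described in the abstract (forgery experts always on, adversary experts switched on by the detector) can be sketched as follows. This is a minimal illustration, not the released implementation. It assumes that each expert adds a residual to the frozen block's output, which the abstract does not specify, and it replaces the learned RGB-domain detector with a toy smoothness heuristic. `ExpertBlock`, `scale_layer`, `toy_detector`, and `run` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]

def scale_layer(factor: float) -> Layer:
    # Hypothetical stand-in for a learned layer: scales every feature.
    return lambda h: [factor * v for v in h]

@dataclass
class ExpertBlock:
    frozen: Layer            # pretrained block, never updated
    forgery_expert: Layer    # always active, shared by all inputs
    adversary_expert: Layer  # active only when the detector fires

    def forward(self, h: Vector, is_adversarial: bool) -> Vector:
        # Assumed residual form: frozen output plus expert corrections.
        out = [a + b for a, b in zip(self.frozen(h), self.forgery_expert(h))]
        if is_adversarial:
            out = [a + b for a, b in zip(out, self.adversary_expert(h))]
        return out

def toy_detector(pixels: Vector, threshold: float = 0.1) -> bool:
    # Toy stand-in for the learned detector: flags inputs whose
    # neighbouring values jump a lot on average (needs >= 2 values).
    jumps = [abs(a - b) for a, b in zip(pixels, pixels[1:])]
    return sum(jumps) / len(jumps) > threshold

def run(blocks: List[ExpertBlock], pixels: Vector) -> Tuple[Vector, bool]:
    # The detector runs once on the raw input; its decision gates the
    # adversary experts in every block.
    flag = toy_detector(pixels)
    h = list(pixels)
    for block in blocks:
        h = block.forward(h, flag)
    return h, flag

blocks = [ExpertBlock(scale_layer(1.0), scale_layer(0.1), scale_layer(-0.05))
          for _ in range(2)]
smooth = [0.50, 0.52, 0.54, 0.56]   # treated as clean by the toy detector
noisy = [0.50, 0.80, 0.45, 0.85]    # treated as adversarial
clean_out, clean_flag = run(blocks, smooth)
adv_out, adv_flag = run(blocks, noisy)
```

On the clean path the adversary experts are never evaluated, which is the property the abstract highlights: clean images pass through the frozen encoder and forgery experts only.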
Problem

Research questions and friction points this paper is trying to address.

Detect and localize image forgeries while resisting adversarial attacks
Enhance forgery artifact capture with injected forgery experts
Correct adversarial noise shifts via adaptive adversary experts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Inject forgery experts into each transformer block
Design a lightweight adversary detector
Inject adversary experts into global attention layers and MLP modules
Rongxuan Peng
Shenzhen University
Multimedia Forensics, Reinforcement Learning, Adversarial Attack and Defense
Shunquan Tan
Shenzhen MSU-BIT University
deep learning, machine learning, multimedia forensics
Chenqi Kong
Rapid-Rich Object Search Lab, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
Anwei Luo
Jiangxi University of Finance and Economics
deepfake, face forgery detection, multimedia security, forensics
Alex C. Kot
Guangdong Laboratory of Machine Perception and Intelligent Computing, Faculty of Engineering, Shenzhen MSU-BIT University, China
Jiwu Huang
Shenzhen MSU-BIT University
Multimedia forensics and security