CLUE: Leveraging Low-Rank Adaptation to Capture Latent Uncovered Evidence for Image Forgery Localization

2025-08-10
Citations: 0
Influential: 0
AI Summary
To address the proliferation of forged images driven by generative AI, this paper proposes a novel image forgery localization method. Our approach repurposes the internal generative process of a state-of-the-art text-to-image model, Stable Diffusion 3 (SD3), as a forensic signal source, marking the first such use in digital image forensics. We use LoRA to parameter-efficiently reconfigure SD3 as a forensic feature extractor, and we use SD3's Rectified Flow (RF) mechanism to inject noise at varying intensities into the latent representation so that the tuned denoising process amplifies subtle forgery traces. Furthermore, we incorporate a parameter-efficiently fine-tuned SAM image encoder to supply semantic context and enhance boundary-aware localization. The resulting feature fusion framework achieves state-of-the-art performance across multiple benchmarks, significantly outperforming existing methods. Notably, it demonstrates strong robustness against common post-processing attacks and social media compression artifacts.

Abstract
The increasing accessibility of image editing tools and generative AI has led to a proliferation of visually convincing forgeries, compromising the authenticity of digital media. In this paper, in addition to leveraging distortions from conventional forgeries, we repurpose the mechanism of a state-of-the-art (SOTA) text-to-image synthesis model by exploiting its internal generative process, turning it into a high-fidelity forgery localization tool. To this end, we propose CLUE (Capture Latent Uncovered Evidence), a framework that employs Low-Rank Adaptation (LoRA) to parameter-efficiently reconfigure Stable Diffusion 3 (SD3) as a forensic feature extractor. Our approach begins with the strategic use of SD3's Rectified Flow (RF) mechanism to inject noise at varying intensities into the latent representation, thereby steering the LoRA-tuned denoising process to amplify subtle statistical inconsistencies indicative of a forgery. To complement the latent analysis with high-level semantic context and precise spatial details, our method incorporates contextual features from the image encoder of the Segment Anything Model (SAM), which is parameter-efficiently adapted to better trace the boundaries of forged regions. Extensive evaluations demonstrate CLUE's SOTA generalization performance, significantly outperforming prior methods. Furthermore, CLUE shows superior robustness against common post-processing attacks and transmission through Online Social Networks (OSNs). Code is publicly available at https://github.com/SZAISEC/CLUE.
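
The noise injection described in the abstract can be illustrated with the standard Rectified Flow forward interpolation, x_t = (1 - t) * x_0 + t * eps, where eps is Gaussian noise and t in [0, 1] sets the intensity. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `rf_noise` and `noisy_views`, the flat-list latent, and the noise levels (0.1, 0.3, 0.5) are hypothetical and chosen only for illustration. The abstract does not state which intensities CLUE uses.

```python
import random


def rf_noise(latent, t, rng):
    """Rectified Flow forward interpolation: x_t = (1 - t) * x_0 + t * eps.

    latent: flat list of floats standing in for a latent tensor (assumption).
    t:      noise intensity in [0, 1]; 0 keeps the latent, 1 is pure noise.
    rng:    random.Random instance supplying standard Gaussian noise.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return [(1.0 - t) * x + t * rng.gauss(0.0, 1.0) for x in latent]


def noisy_views(latent, levels=(0.1, 0.3, 0.5), seed=0):
    """Build one noised copy of the latent per intensity level.

    The default levels are hypothetical placeholders, not values from the paper.
    """
    rng = random.Random(seed)
    return {t: rf_noise(latent, t, rng) for t in levels}


if __name__ == "__main__":
    x0 = [0.5, -1.0, 2.0, 0.0]
    for t, xt in noisy_views(x0).items():
        print(t, [round(v, 3) for v in xt])
```

In a pipeline of this kind, each noised latent would be passed to the adapted denoiser, and the intermediate features would serve as forensic evidence. That step depends on the SD3 model itself and is not sketched here.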
Problem

Research questions and friction points this paper is trying to address.

Localize forged image regions by exposing inconsistencies through a generative model
Adapt Stable Diffusion 3 for forensic feature extraction
Improve forgery localization with semantic and spatial context
Innovation

Methods, ideas, or system contributions that make the work stand out.

LoRA adapts Stable Diffusion for forensics
Rectified Flow amplifies forgery inconsistencies
SAM encoder enhances boundary tracing
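
As background for the first point above, LoRA keeps a pretrained weight matrix W frozen and learns a low-rank update, giving an effective weight W + (alpha / r) * B A, where A has shape r x in, B has shape out x r, and r is much smaller than the layer width. The sketch below shows only this arithmetic in plain Python. The helper names `matmul` and `lora_weight` and the toy matrices are hypothetical, and nothing here reflects the specific ranks, layers, or scaling used in CLUE.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, a, b, alpha):
    """Return the effective weight W + (alpha / r) * B @ A.

    w: frozen pretrained weight, shape out x in.
    a: trainable down-projection, shape r x in.
    b: trainable up-projection, shape out x r.
    alpha: scaling hyperparameter; r is read from the number of rows of a.
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # (out x r) @ (r x in) -> out x in
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    w = [[1.0, 0.0], [0.0, 1.0]]   # toy frozen weight
    a = [[1.0, 2.0]]               # rank r = 1
    b_zero = [[0.0], [0.0]]        # common initialization: B = 0
    print(lora_weight(w, a, b_zero, alpha=2.0))  # equals w while B is zero
```

Because B is typically initialized to zero, the adapted model starts out identical to the pretrained one, and only the small matrices A and B are trained.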
Youqi Wang
Guangdong Provincial Key Laboratory of Intelligent Information Processing, Shenzhen University, China
Shunquan Tan
Shenzhen MSU-BIT University
Deep learning, machine learning, multimedia forensics
Rongxuan Peng
Shenzhen University
Multimedia forensics, reinforcement learning, adversarial attack and defense
Bin Li
Faculty of Engineering, Shenzhen MSU-BIT University, China
Jiwu Huang
Shenzhen MSU-BIT University
Multimedia forensics and security