EDGER: EDge-Guided with HEatmap Refinement for Generalizable Image Forgery Localization

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses inaccurate forgery localization in cross-domain scenarios, particularly for text-guided, high-fidelity image inpainting and inputs of arbitrary resolution. The paper proposes EDGER, a patch-based dual-branch framework. An edge-guided segmentation branch uses frequency-based edge detection to emphasize high-frequency inconsistencies at manipulation boundaries, then fuses RGB and edge features to produce pixel-level masks. A synthetic heatmap branch fine-tunes CLIP-ViT with LoRA adapters to identify fully synthesized patches, yielding coarse, patch-level priors. The two branches are complementary and together give resolution-agnostic forgery localization. By jointly exploiting frequency-based edge cues and patch-level synthetic priors (described as a first in the field), EDGER improves cross-domain generalization, performs strongly on the MediaEval 2025 SynthIM Challenge, and scales to multi-megapixel images.
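The summary does not say how the two branch outputs are merged. As a minimal sketch, assuming a simple rule where any patch flagged as fully synthetic overrides the pixel-level mask inside that patch, the combination could look like the following. The function name `fuse_patch_predictions`, the `threshold` parameter, and the override rule itself are illustrative assumptions, not the authors' method.

```python
import numpy as np


def fuse_patch_predictions(seg_mask, patch_scores, patch_size, threshold=0.5):
    """Combine a pixel-level mask with patch-level synthetic scores.

    seg_mask:     (H, W) array in [0, 1] from a segmentation branch.
    patch_scores: (ceil(H / p), ceil(W / p)) array in [0, 1], one score per
                  patch, from a patch classifier.
    Assumed rule (hypothetical): patches scoring >= threshold are treated as
    fully synthetic and filled with 1; elsewhere the pixel mask is kept.
    """
    h, w = seg_mask.shape
    # Expand each patch score to a patch_size x patch_size block, then crop
    # to the image size in case H or W is not a multiple of patch_size.
    block = np.ones((patch_size, patch_size))
    heat = np.kron(patch_scores, block)[:h, :w]
    return np.where(heat >= threshold, 1.0, seg_mask)


# Toy example: a 4x4 image split into four 2x2 patches.
seg = np.zeros((4, 4))
seg[0, 0] = 0.8                      # one boundary pixel from segmentation
scores = np.array([[0.1, 0.9],
                   [0.2, 0.3]])      # only the top-right patch is flagged
fused = fuse_patch_predictions(seg, scores, patch_size=2)
```

In this toy case the top-right patch becomes fully marked while the single segmentation pixel in the top-left patch is preserved, which mirrors the coarse-prior plus fine-boundary split described above.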

📝 Abstract

Text-guided inpainting has made image forgery increasingly realistic, challenging both Synthetic Image Detection (SID) and Image Forgery Localization (IFL). Existing methods often struggle to pick out suspicious signals across domains. To address this problem, we propose EDGER, a patch-based, dual-branch framework that localizes manipulated regions in arbitrary-resolution images without sacrificing native resolution. The first branch, Edge-Guided Segmentation, introduces a Frequency-based Edge Detector to emphasize high-frequency inconsistencies at manipulation boundaries, and fine-tunes a SegFormer to fuse RGB and edge features into pixel-level masks. Because edge evidence is informative mainly when a patch contains both authentic and manipulated pixels, we complement Edge-Guided Segmentation with a Synthetic Heatmapping branch, a classification-based localizer that fine-tunes a CLIP-ViT image encoder with LoRA to flag fully synthetic patches. Synthetic Heatmapping provides coarse, patch-level synthetic priors, while Edge-Guided Segmentation sharpens boundaries within partially manipulated patches; together they yield comprehensive localization. Evaluated in the setting of the Manipulated Region Localization task of the MediaEval 2025 SynthIM challenge, our approach scales to multi-megapixel imagery and exhibits strong cross-domain generalization. Extensive ablations highlight the complementary roles of frequency-based edge cues and patch-level synthetic priors in driving accurate, resolution-agnostic localization.
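The abstract does not specify how the Frequency-based Edge Detector is built. To illustrate the general idea of emphasizing high-frequency content, here is a minimal sketch that assumes a hard circular high-pass filter in the FFT domain. The function name `highpass_residual` and the `cutoff` value are hypothetical, and the actual detector in the paper may work differently.

```python
import numpy as np


def highpass_residual(gray, cutoff=0.1):
    """Return the magnitude of the high-frequency part of a 2-D patch.

    Assumption: frequencies with radius below `cutoff` (in cycles per pixel)
    are zeroed out in the FFT domain; what remains is concentrated around
    sharp transitions such as edges.
    """
    h, w = gray.shape
    # Frequency coordinates in cycles per pixel, each in [-0.5, 0.5).
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    spectrum = np.fft.fft2(gray.astype(np.float64))
    spectrum[radius < cutoff] = 0.0          # drop low frequencies and DC
    return np.abs(np.fft.ifft2(spectrum).real)


# Toy patch: left half dark, right half bright, so there is a vertical
# edge between columns 31 and 32.
patch = np.zeros((64, 64))
patch[:, 32:] = 1.0
edges = highpass_residual(patch)
edge_strength = edges[:, 31:33].mean()   # response at the step
flat_strength = edges[:, 14:18].mean()   # response inside a flat region
```

On this toy patch the response is concentrated at the step and weak inside the flat regions. A map of this kind could be supplied alongside the RGB patch as an extra input to a segmentation model, which is the role the abstract assigns to edge features.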

Problem

Research questions and friction points this paper is trying to address.

Image Forgery Localization

Cross-domain Generalization

Text-guided Inpainting

Manipulated Region Detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Edge-Guided Segmentation

Synthetic Heatmapping

Frequency-based Edge Detection

Cross-domain Generalization