Beyond Fully Supervised Pixel Annotations: Scribble-Driven Weakly-Supervised Framework for Image Manipulation Localization

📅 2025-07-17

🤖 AI Summary

Problem: Pixel-level annotations for image manipulation localization (IML) are expensive to produce, and existing weakly supervised methods, which rely on image-level labels, lose accuracy because the supervision signal is too sparse. Method: This paper proposes the first scribble-based weakly supervised IML framework, built around a Prior-aware Feature Modulation Module (PFMM) and a Gated Adaptive Fusion Module (GAFM). Training combines self-supervision through a structural consistency loss over multi-scale and augmented inputs with a confidence-aware entropy minimization loss ($\mathcal{L}_{\mathrm{CEM}}$). Contribution/Results: The authors re-annotate mainstream IML datasets with scribble labels to build Sc-IML, the first scribble-based IML dataset. Using only scribble annotations rather than dense pixel-level labels, the method outperforms existing fully supervised approaches in average performance, both in-distribution and out-of-distribution.
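The structural consistency idea can be illustrated with a minimal sketch. It assumes the model has already produced one probability map for the full-resolution image and one for a half-resolution copy, and it penalizes disagreement between the two after bringing them to the same size. The function names `avg_pool_2x2` and `consistency_loss`, the use of 2x2 average pooling, and the plain mean absolute difference are all illustrative assumptions; the summary does not give the paper's exact formulation, which may include further terms.

```python
def avg_pool_2x2(mask):
    """Downscale a 2D probability map by averaging non-overlapping 2x2 blocks."""
    h, w = len(mask), len(mask[0])
    return [
        [
            (mask[r][c] + mask[r][c + 1] + mask[r + 1][c] + mask[r + 1][c + 1]) / 4.0
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


def consistency_loss(pred_full, pred_half):
    """Mean absolute difference between the pooled full-scale prediction
    and the prediction made on the half-scale input (assumed L1 form)."""
    pooled = avg_pool_2x2(pred_full)
    diffs = [
        abs(p - q)
        for row_p, row_q in zip(pooled, pred_half)
        for p, q in zip(row_p, row_q)
    ]
    return sum(diffs) / len(diffs)


# Hand-written stand-ins for model outputs (hypothetical values).
pred_full = [
    [0.9, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
agreeing_half = [[0.9, 0.1], [0.1, 0.1]]      # same region flagged at both scales
disagreeing_half = [[0.1, 0.9], [0.1, 0.1]]   # flagged region moved

print(consistency_loss(pred_full, agreeing_half))
print(consistency_loss(pred_full, disagreeing_half))
```

The loss is near zero when both scales flag the same region and grows when they disagree, which is the behavior a consistency term is meant to reward.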

📝 Abstract

Deep learning-based image manipulation localization (IML) methods have achieved remarkable performance in recent years, but typically rely on large-scale pixel-level annotated datasets. To address the challenge of acquiring high-quality annotations, some recent weakly supervised methods utilize image-level labels to segment manipulated regions. However, the performance is still limited due to insufficient supervision signals. In this study, we explore a form of weak supervision that improves the annotation efficiency and detection performance, namely scribble annotation supervision. We re-annotated mainstream IML datasets with scribble labels and propose the first scribble-based IML (Sc-IML) dataset. Additionally, we propose the first scribble-based weakly supervised IML framework. Specifically, we employ self-supervised training with a structural consistency loss to encourage the model to produce consistent predictions under multi-scale and augmented inputs. In addition, we propose a prior-aware feature modulation module (PFMM) that adaptively integrates prior information from both manipulated and authentic regions for dynamic feature adjustment, further enhancing feature discriminability and prediction consistency in complex scenes. We also propose a gated adaptive fusion module (GAFM) that utilizes gating mechanisms to regulate information flow during feature fusion, guiding the model toward emphasizing potential tampered regions. Finally, we propose a confidence-aware entropy minimization loss ($\mathcal{L}_{\mathrm{CEM}}$). This loss dynamically regularizes predictions in weakly annotated or unlabeled regions based on model uncertainty, effectively suppressing unreliable predictions. Experimental results show that our method outperforms existing fully supervised approaches in terms of average performance both in-distribution and out-of-distribution.
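To make the scribble-supervised loss setup more concrete, the following is a minimal sketch, not the paper's definition of $\mathcal{L}_{\mathrm{CEM}}$. It assumes pixels are flattened into a list, that a scribble label is `1` (manipulated), `0` (authentic), or `None` (unlabeled), and that the entropy term on unlabeled pixels is weighted by a confidence score `|2p - 1|`. The weighting scheme and the names `partial_bce` and `confidence_weighted_entropy` are hypothetical choices made for illustration.

```python
import math

EPS = 1e-7  # keeps log() away from zero


def binary_entropy(p):
    """Shannon entropy (in nats) of a Bernoulli prediction p."""
    p = min(max(p, EPS), 1.0 - EPS)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def partial_bce(probs, scribbles):
    """Binary cross-entropy averaged over scribble-labelled pixels only."""
    losses = []
    for p, y in zip(probs, scribbles):
        if y is None:  # unlabelled pixel: no direct supervision
            continue
        p = min(max(p, EPS), 1.0 - EPS)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1.0 - p)))
    return sum(losses) / len(losses) if losses else 0.0


def confidence_weighted_entropy(probs, scribbles):
    """Entropy over unlabelled pixels, weighted by confidence |2p - 1|.

    Assumed weighting: pixels near p = 0.5 (least reliable) contribute
    little, so the regulariser mainly sharpens already-confident pixels.
    """
    weighted, total_weight = 0.0, 0.0
    for p, y in zip(probs, scribbles):
        if y is not None:
            continue
        w = abs(2.0 * p - 1.0)
        weighted += w * binary_entropy(p)
        total_weight += w
    return weighted / total_weight if total_weight > 0 else 0.0


# Five pixels: two covered by scribbles, three unlabelled (hypothetical values).
probs = [0.95, 0.05, 0.90, 0.50, 0.60]
scribbles = [1, 0, None, None, None]

supervised = partial_bce(probs, scribbles)
regulariser = confidence_weighted_entropy(probs, scribbles)
print(supervised, regulariser)
```

In a training loop the two terms would typically be summed with a weighting coefficient; that coefficient is not stated in the abstract, so it is left out here.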

Problem

Research questions and friction points this paper is trying to address.

Reducing reliance on pixel-level annotations for image manipulation localization

Improving weakly supervised IML with scribble annotations and novel modules

Enhancing detection performance in complex and out-of-distribution scenes

Innovation

Methods, ideas, or system contributions that make the work stand out.

Scribble annotation supervision for efficient labeling

Prior-aware feature modulation for dynamic adjustment

Gated adaptive fusion to emphasize tampered regions
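The gating idea behind the fusion module can be sketched in a few lines. This is a generic gated fusion, assuming two feature vectors of equal length and a per-channel sigmoid gate that decides how much of each stream passes through. The parameter names `w_a`, `w_b`, and `bias`, and the convex-combination form, are assumptions for illustration and are not taken from the paper's GAFM design.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(feat_a, feat_b, w_a, w_b, bias):
    """Fuse two feature vectors channel by channel.

    gate  = sigmoid(w_a * a + w_b * b + bias)
    fused = gate * a + (1 - gate) * b
    In a trained network w_a, w_b and bias would be learned parameters.
    """
    fused = []
    for a, b, wa, wb, b0 in zip(feat_a, feat_b, w_a, w_b, bias):
        gate = sigmoid(wa * a + wb * b + b0)
        fused.append(gate * a + (1.0 - gate) * b)
    return fused


# Hypothetical two-channel features from two branches.
feat_a = [2.0, -1.0]
feat_b = [4.0, 3.0]

# Zero parameters give gate = 0.5, i.e. a plain average of the two streams.
print(gated_fusion(feat_a, feat_b, [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]))

# A large positive bias opens the gate towards feat_a.
print(gated_fusion(feat_a, feat_b, [0.0, 0.0], [0.0, 0.0], [20.0, 20.0]))
```

Because the gate depends on the features themselves, the mixing ratio can differ per channel and per input, which is what lets a gated module emphasize some regions or channels over others.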

🔎 Similar Papers

Dense Feature Interaction Network for Image Inpainting Localization